Comparable Corpora | Sciweavers

CLEF
2011
Springer

255 views, Information Technology

A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Documents

14 years 6 months ago

Download web2py.iiit.ac.in

Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...

N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma

CIKM
2011
Springer

220 views, Information Technology

Mining entity translations from comparable corpora: a holistic graph mapping approach

14 years 6 months ago

Download www.postech.ac.kr

This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe...

Jinhan Kim, Long Jiang, Seung-won Hwang, Young-In ...

IRFC
2011
Springer

373 views, Information Technology

Multilingual Document Clustering Using Wikipedia as External Knowledge

14 years 10 months ago

Download web2py.iiit.ac.in

This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been highly exploited in many monolingual...

N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma

ACL
2011

203 views, Computational Linguistics

Domain Adaptation for Machine Translation by Mining Unseen Words

14 years 10 months ago

Download www.umiacs.umd.edu

We show that unseen words account for a large part of the translation error when moving to new domains. Using an extension of a recent approach to mining translations from compara...

Hal Daumé III, Jagadeesh Jagarlamudi

ACL
2011

211 views, Computational Linguistics

Rare Word Translation Extraction from Aligned Comparable Documents

14 years 10 months ago

Download eprochasson.free.fr

We present a first known result of high precision rare word bilingual extraction from comparable corpora, using aligned comparable documents and supervised classification. We in...

Emmanuel Prochasson, Pascale Fung

COLING
2010

191 views, Computational Linguistics

Mining Large-scale Comparable Corpora from Chinese-English News Collections

15 years 1 months ago

Download www.aclweb.org

In this paper, we explore a CLIR-based approach to construct large-scale Chinese-English comparable corpora, which is valuable for translation knowledge mining. The initial source...

Degen Huang, Lian Zhao, Lishuang Li, Haitao Yu

EACL
2009
ACL Anthology

109 views, Natural Language Processing

MINT: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora

15 years 4 months ago

Download research.microsoft.com

In this paper, we address the problem of mining transliterations of Named Entities (NEs) from large comparable corpora. We leverage the empirical fact that multilingual news artic...

Raghavendra Udupa, K. Saravanan, A. Kumaran, Jagad...

INLG
2010
Springer

128 views, Natural Language Processing

Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation

15 years 4 months ago

Download www.aclweb.org

Building NLG systems, in particular statistical ones, requires parallel data (paired inputs and outputs) which do not generally occur naturally. In this paper, we investigate the ...

Anja Belz, Eric Kow

COLING
2002

164 views, Computational Linguistics

Looking for Candidate Translational Equivalents in Specialized, Comparable Corpora

15 years 6 months ago

Download www.limsi.fr

Previous attempts at identifying translational equivalents in comparable corpora have dealt with very large 'general language' corpora and words. We address this task in a sp...

Yun-Chuang Chiao, Pierre Zweigenbaum

MT
2006

95 views

Finding translations for low-frequency words in comparable corpora

15 years 6 months ago

Download clg.wlv.ac.uk

Abstract. The paper proposes a method to improve the extraction of low-frequency translation equivalents from comparable corpora. Prior to performing the mapping between vector spac...

Viktor Pekar, Ruslan Mitkov, Dimitar Blagoev, Andr...
