Sciweavers

CLEF
2011
Springer
13 years 7 days ago
A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Documents
Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...
N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma
CIKM
2011
Springer
13 years 12 days ago
Mining entity translations from comparable corpora: a holistic graph mapping approach
This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe...
Jinhan Kim, Long Jiang, Seung-won Hwang, Young-In ...
IRFC
2011
Springer
13 years 3 months ago
Multilingual Document Clustering Using Wikipedia as External Knowledge
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been highly exploited in many monolingual...
N. Kiran Kumar, G. S. K. Santosh, Vasudeva Varma
ACL
2011
13 years 4 months ago
Domain Adaptation for Machine Translation by Mining Unseen Words
We show that unseen words account for a large part of the translation error when moving to new domains. Using an extension of a recent approach to mining translations from compara...
Hal Daumé III, Jagadeesh Jagarlamudi
ACL
2011
13 years 4 months ago
Rare Word Translation Extraction from Aligned Comparable Documents
We present a first known result of high precision rare word bilingual extraction from comparable corpora, using aligned comparable documents and supervised classification. We in...
Emmanuel Prochasson, Pascale Fung
COLING
2010
13 years 7 months ago
Mining Large-scale Comparable Corpora from Chinese-English News Collections
In this paper, we explore a CLIR-based approach to construct large-scale Chinese-English comparable corpora, which is valuable for translation knowledge mining. The initial source...
Degen Huang, Lian Zhao, Lishuang Li, Haitao Yu
EACL
2009
ACL Anthology
13 years 10 months ago
MINT: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora
In this paper, we address the problem of mining transliterations of Named Entities (NEs) from large comparable corpora. We leverage the empirical fact that multilingual news artic...
Raghavendra Udupa, K. Saravanan, A. Kumaran, Jagad...
INLG
2010
Springer
13 years 10 months ago
Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation
Building NLG systems, in particular statistical ones, requires parallel data (paired inputs and outputs) which do not generally occur naturally. In this paper, we investigate the ...
Anja Belz, Eric Kow
COLING
2002
14 years 8 days ago
Looking for Candidate Translational Equivalents in Specialized, Comparable Corpora
Previous attempts at identifying translational equivalents in comparable corpora have dealt with very large 'general language' corpora and words. We address this task in a sp...
Yun-Chuang Chiao, Pierre Zweigenbaum
MT
2006
14 years 11 days ago
Finding translations for low-frequency words in comparable corpora
Abstract. The paper proposes a method to improve the extraction of low-frequency translation equivalents from comparable corpora. Prior to performing the mapping between vector spac...
Viktor Pekar, Ruslan Mitkov, Dimitar Blagoev, Andr...