Sciweavers

88 search results - page 2 / 18
» Process Model for Composing High-quality Text Corpora
ACL
2009
13 years 5 months ago
Active Learning for Multilingual Statistical Machine Translation
Statistical machine translation (SMT) models require bilingual corpora for training, and these corpora are often multilingual with parallel text in multiple languages simultaneous...
Gholamreza Haffari, Anoop Sarkar
EACL
2006
ACL Anthology
13 years 8 months ago
Multilingual Term Extraction from Domain-specific Corpora Using Morphological Structure
Morphologically complex terms composed from Greek or Latin elements are frequent in scientific and technical texts. Word forming units are thus relevant cues for the identificatio...
Delphine Bernhard
KDD
2010
ACM
13 years 11 months ago
Evolutionary hierarchical Dirichlet processes for multiple correlated time-varying corpora
Mining cluster evolution from multiple correlated time-varying text corpora is important in exploratory text analytics. In this paper, we propose an approach called evolutionary h...
Jianwen Zhang, Yangqiu Song, Changshui Zhang, Shix...
TOIS
2010
13 years 5 months ago
Learning author-topic models from text corpora
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a...
Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas L...
SIGIR
2006
ACM
14 years 1 months ago
Improving the estimation of relevance models using large external corpora
Information retrieval algorithms leverage various collection statistics to improve performance. Because these statistics are often computed on a relatively small evaluation corpus...
Fernando Diaz, Donald Metzler