Sciweavers

88 search results - page 6 / 18
» Process Model for Composing High-quality Text Corpora
LREC
2010
13 years 8 months ago
Adapting Chinese Word Segmentation for Machine Translation Based on Short Units
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike in most Western languages. Therefore, Chinese word segmentation is considered an ...
Yiou Wang, Kiyotaka Uchimoto, Jun'ichi Kazama, Can...
ICA
2007
Springer
13 years 11 months ago
Text Clustering on Latent Thematic Spaces: Variants, Strengths and Weaknesses
Deriving a thematically meaningful partition of an unlabeled document corpus is a challenging task. In this context, the use of document representations based on latent thematic ge...
Xavier Sevillano, Germán Cobo, Francesc Al...
FLAIRS
2006
13 years 8 months ago
Constructing a Corpus-based Ontology Using Model Bias
Recent work in lexical resource construction has recognized the importance of contextualizing the knowledge in existing resources and ontologies with information derived from text...
Anna Rumshisky, Patrick Hanks, Catherine Havasi, J...
EMNLP
2008
13 years 8 months ago
Learning Graph Walk Based Similarity Measures for Parsed Text
We consider a parsed text corpus as an instance of a labelled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between the...
Einat Minkov, William W. Cohen
EMNLP
2009
13 years 5 months ago
Polylingual Topic Models
Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive colle...
David M. Mimno, Hanna M. Wallach, Jason Naradowsky...