Sciweavers

606 search results - page 10 / 122
» words 2002
CORR
2002
Springer
13 years 9 months ago
Unsupervised discovery of morphologically related words based on orthographic and semantic similarity
We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to di...
Marco Baroni, Johannes Matiasek, Harald Trost
COLING
2002
13 years 9 months ago
Investigating the Relationship between Word Segmentation Performance and Retrieval Performance in Chinese IR
It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese...
Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick ...
ITA
2002
13 years 9 months ago
Density of Critical Factorizations
Abstract. We investigate the density of critical factorizations of infinite sequences of words. The density of critical factorizations of a word is the ratio between the number of p...
Tero Harju, Dirk Nowotka
COLING
2002
13 years 9 months ago
Unsupervised Word Sense Disambiguation Using Bilingual Comparable Corpora
An unsupervised method for word sense disambiguation using a bilingual comparable corpus was developed. First, it extracts statistically significant pairs of related words from th...
Hiroyuki Kaji, Yasutsugu Morimoto
COLING
2002
13 years 9 months ago
Unknown Word Extraction for Chinese Documents
There is no blank to mark word boundaries in Chinese text. As a result, identifying words is difficult, because of segmentation ambiguities and occurrences of unknown words. Conve...
Keh-Jiann Chen, Wei-Yun Ma