Search Sciweavers | Sciweavers

CORR 2002, Springer (Education)

Unsupervised discovery of morphologically related words based on orthographic and semantic similarity

15 years 6 months ago

We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to di...

Marco Baroni, Johannes Matiasek, Harald Trost

COLING 2002 (Computational Linguistics)

Investigating the Relationship between Word Segmentation Performance and Retrieval Performance in Chinese IR

15 years 6 months ago

Download acl.ldc.upenn.edu

It is commonly believed that word segmentation accuracy is monotonically related to retrieval performance in Chinese information retrieval. In this paper we show that, for Chinese...

Fuchun Peng, Xiangji Huang, Dale Schuurmans, Nick ...

ITA 2002 (Communications)

Density of Critical Factorizations

15 years 6 months ago

Download www.fmi.uni-stuttgart.de

Abstract. We investigate the density of critical factorizations of infinite sequences of words. The density of critical factorizations of a word is the ratio between the number of p...

Tero Harju, Dirk Nowotka

COLING 2002 (Computational Linguistics)

Unsupervised Word Sense Disambiguation Using Bilingual Comparable Corpora

15 years 6 months ago

Download acl.ldc.upenn.edu

An unsupervised method for word sense disambiguation using a bilingual comparable corpus was developed. First, it extracts statistically significant pairs of related words from th...

Hiroyuki Kaji, Yasutsugu Morimoto

COLING 2002 (Computational Linguistics)

Unknown Word Extraction for Chinese Documents

15 years 6 months ago

Download acl.ldc.upenn.edu

There is no blank to mark word boundaries in Chinese text. As a result, identifying words is difficult, because of segmentation ambiguities and occurrences of unknown words. Conve...

Keh-Jiann Chen, Wei-Yun Ma
