Search Sciweavers | Sciweavers

700 search results - page 28 / 140

» Language Model Based Arabic Word Segmentation

CORR
1998
Springer

96 views, Education

15 years 6 months ago

Similarity-Based Models of Word Cooccurrence Probabilities

Download www.cis.upenn.edu

Abstract. In many applications of natural language processing (NLP) it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may ne...

Ido Dagan, Lillian Lee, Fernando C. N. Pereira

ACL
2012

199 views, Computational Linguistics

Unsupervized Word Segmentation: the Case for Mandarin Chinese

13 years 9 months ago

Download aclweb.org

In this paper, we present an unsupervized segmentation system tested on Mandarin Chinese. Following Harris's Hypothesis in Kempe (1999) and Tanaka-Ishii's (2005) reformu...

Pierre Magistry, Benoît Sagot

IJCNLP
2004
Springer

117 views, Natural Language Processing

The Use of SVM for Chinese New Word Identification

16 years 17 days ago

Download research.microsoft.com

We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...

Hongqiao Li, Changning Huang, Jianfeng Gao, Xiaozh...

EMNLP
2008

234 views, Natural Language Processing

Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce

15 years 8 months ago

Download www.umiacs.umd.edu

This paper explores the challenge of scaling up language processing algorithms to increasingly large datasets. While cluster computing has been available in commercial environment...

Jimmy J. Lin

DAS
2006
Springer

119 views, Document Analysis

Language Identification in Degraded and Distorted Document Images

15 years 11 months ago

Download www.comp.nus.edu.sg

This paper presents a language identification technique that differentiates Latin-based languages in degraded and distorted document images. Different from the reported methods tha...

Shijian Lu, Chew Lim Tan, Weihua Huang
