Sciweavers

700 search results - page 28 / 140
» Language Model Based Arabic Word Segmentation
CORR
1998
Springer
13 years 8 months ago
Similarity-Based Models of Word Cooccurrence Probabilities
Abstract. In many applications of natural language processing (NLP) it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may ne...
Ido Dagan, Lillian Lee, Fernando C. N. Pereira
ACL
2012
11 years 11 months ago
Unsupervized Word Segmentation: the Case for Mandarin Chinese
In this paper, we present an unsupervized segmentation system tested on Mandarin Chinese. Following Harris's Hypothesis in Kempe (1999) and Tanaka-Ishii's (2005) reformu...
Pierre Magistry, Benoît Sagot
IJCNLP
2004
Springer
14 years 1 month ago
The Use of SVM for Chinese New Word Identification
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...
Hongqiao Li, Changning Huang, Jianfeng Gao, Xiaozh...
EMNLP
2008
13 years 10 months ago
Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce
This paper explores the challenge of scaling up language processing algorithms to increasingly large datasets. While cluster computing has been available in commercial environment...
Jimmy J. Lin
DAS
2006
Springer
14 years 4 days ago
Language Identification in Degraded and Distorted Document Images
This paper presents a language identification technique that differentiates Latin-based languages in degraded and distorted document images. Different from the reported methods tha...
Shijian Lu, Chew Lim Tan, Weihua Huang