Sciweavers

25 search results - page 4 / 5
» A compression-based algorithm for Chinese word segmentation
NLPRS
2001
Springer
13 years 11 months ago
Vietnamese Word Segmentation
Word segmentation is the first and obligatory task for every NLP. For inflectional languages like English, French, Dutch, etc., their word boundaries are simply assumed to be whitespa...
Dinh Dien, Hoang Kiem, Nguyen Van Toan
EMNLP
2010
13 years 5 months ago
A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model
We show that the standard beam-search algorithm can be used as an efficient decoder for the global linear model of Zhang and Clark (2008) for joint word segmentation and POS-taggi...
Yue Zhang 0004, Stephen Clark
ACL
2008
13 years 8 months ago
Text Segmentation with LDA-Based Fisher Kernel
In this paper we propose a domain-independent text segmentation method, which consists of three components. Latent Dirichlet allocation (LDA) is employed to compute words semantic ...
Qi Sun, Runxin Li, Dingsheng Luo, Xihong Wu
ACL
1998
13 years 8 months ago
Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model
We present a novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OC...
Masaaki Nagata
CORR
2002
Springer
13 years 7 months ago
Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences
Given the lack of word delimiters in written Japanese, word segmentation is generally considered a crucial first step in processing Japanese texts. Typical Japanese segmentation a...
Rie Kubota Ando, Lillian Lee