Sciweavers

Search results for "Urdu Word Segmentation": 689 results, page 52 of 138
WWW
2011
ACM
13 years 3 months ago
Unsupervised query segmentation using only query logs
We introduce an unsupervised query segmentation scheme that uses query logs as the only resource and can effectively capture the structural units in queries. We believe that Web s...
Nikita Mishra, Rishiraj Saha Roy, Niloy Ganguly, S...
LREC
2010
13 years 10 months ago
How Large a Corpus Do We Need: Statistical Method Versus Rule-based Method
We investigate the impact of input data scale in corpus-based learning using a study style of Zipf's law. In our research, Chinese word segmentation is chosen as the study ca...
Hai Zhao, Yan Song, Chunyu Kit
INFORMATICALT
2006
13 years 8 months ago
Discrimination of Homographs Distorted by a Lengthy Impulsive Noise
The paper addresses the problem of discrimination of homographs when a lengthy segment of an uttered word is missing. The considered discrimination procedure is done by r...
Sarunas Paulikas, Dalius Navakauskas
COLING
1992
13 years 9 months ago
Tokenization As The Initial Phase In NLP
In this paper, the authors address the significance and complexity of tokenization, the beginning step of NLP. Notions of word and token are discussed and defined from the viewpoin...
Jonathan J. Webster, Chunyu Kit
ICASSP
2010
IEEE
13 years 8 months ago
From flat direct models to segmental CRF models
This paper summarizes recent work at Microsoft on the development of novel direct models. The key characteristic of our approaches is the use of long-span segment level features t...
Geoffrey Zweig, Patrick Nguyen