Sciweavers

» Improved Text Generation Using N-gram Statistics
WSDM
2009
ACM
14 years 5 months ago
Clustering the tagged web
Automatically clustering web pages into semantic groups promises improved search and browsing on the web. In this paper, we demonstrate how user-generated tags from large-scale soc...
Daniel Ramage, Paul Heymann, Christopher D. Mannin...
TKDE
2008
13 years 10 months ago
Text Clustering with Feature Selection by Using Statistical Data
Feature selection is an important method for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the ...
Yanjun Li, Congnan Luo, Soon M. Chung
EMNLP
2010
13 years 9 months ago
Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping
Almost all Chinese language processing tasks involve word segmentation of the language input as their first steps, thus robust and reliable segmentation techniques are always requ...
Baobao Chang, Dongxu Han
COLING
2010
13 years 5 months ago
Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation
Syntax based reordering has been shown to be an effective way of handling word order differences between source and target languages in Statistical Machine Translation (SMT) syste...
Karthik Visweswariah, Jiri Navratil, Jeffrey S. So...
SIGIR
1995
ACM
14 years 2 months ago
Noise Reduction in a Statistical Approach to Text Categorization
This paper studies noise reduction for computational efficiency improvements in a statistical learning method for text categorization, the Linear Least Squares Fit (LLSF) mapping...
Yiming Yang