Sciweavers

EMNLP
2009
Less is More: Significance-Based N-gram Selection for Smaller, Better Language Models
The recent availability of large corpora for training N-gram language models has shown the utility of models of higher order than just trigrams. In this paper, we investigate meth...
Robert C. Moore, Chris Quirk
LREC
2008
Improving Statistical Machine Translation Efficiency by Triangulation
In current phrase-based Statistical Machine Translation systems, more training data is generally better than less. However, a larger data set eventually introduces a larger model ...
Yu Chen, Andreas Eisele, Martin Kay
ICPR
2008
IEEE
Effective Shrinkage of Large Multi-Class Linear SVM Models for Text Categorization
When linear support vector machines (SVMs) are applied to multi-class text categorization in industry, the size of the linear SVM model is very large, usually greater than several...
Jian-xiong Dong, Ching Y. Suen, Adam Krzyzak