Abstract. Most common feature selection techniques for document categorization are supervised and require lots of training data in order to accurately capture the descriptive and d...
Discovering and summarizing opinions from online reviews is an important and challenging task. A commonly-adopted framework generates structured review summaries with aspects and ...
Wayne Xin Zhao, Jing Jiang, Hongfei Yan, Xiaoming ...
In many applications, replacing a complex word form by its stem can reduce sparsity, revealing connections in the data that would not otherwise be apparent. In this paper, we focu...
Shane Bergsma, Aditya Bhargava, Hua He, Grzegorz K...
We describe a new scalable algorithm for semi-supervised training of conditional random fields (CRF) and its application to partof-speech (POS) tagging. The algorithm uses a simil...
We present a novel approach to distributionalonly, fully unsupervised, POS tagging, based on an adaptation of the EM algorithm for the estimation of a Gaussian mixture. In this ap...