An unsupervised classification algorithm is derived by modeling observed data as a mixture of several mutually exclusive classes that are each described by linear combinations of i...
The amount of available Thai broadcast news transcribed text for training a language model is still very limited, comparing to other major languages. Since the construction of a b...
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train diffe...
: This article describes a multilayer model-based approach for text compression. It uses linguistic information to develop a multilayer decomposition model of the text in order to ...
Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely af...
Jason D. Rennie, Lawrence Shih, Jaime Teevan, Davi...