This paper proposes a new approach for classifying text documents into two disjoint classes. The new approach is based on extracting patterns, in the form of two logical expressio...
Text classification using a small labeled set and a large unlabeled data is seen as a promising technique to reduce the labor-intensive and time consuming effort of labeling traini...
High dimensional directional data is becoming increasingly important in contemporary applications such as analysis of text and gene-expression data. A natural model for multivaria...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...
In spite of the initialization problem, the ExpectationMaximization (EM) algorithm is widely used for estimating the parameters in several data mining related tasks. Most popular ...
Chandan K. Reddy, Hsiao-Dong Chiang, Bala Rajaratn...
Probabilistic topic models have become popular as methods for dimensionality reduction in collections of text documents or images. These models are usually treated as generative m...