We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
For a wide variety of classification algorithms, scalability to large databases can be achieved by observing that most algorithms are driven by a set of sufficient statistics that...
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...
People take more and more photos at different time and different events, however, these photos are often put into one giant folder and they are seldom annotated or organized. As t...
In this paper, we present a new method for video genre identification based on the linguistic content analysis. This approach relies on the analysis of the most frequent words in...