A practical method is presented for creating a high-dimensional index structure that adapts to the data distribution and scales well with the database size. Typical media descrip...
Peng Wu, B. S. Manjunath, Shivkumar Chandrasekaran
The purpose of text clustering in information retrieval is to discover groups of semantically related documents. Accurate and comprehensible cluster descriptions (labels) let the ...
We consider a challenging clustering task: the clustering of multi-word terms without document co-occurrence information in order to form coherent groups of topics. For this task, ...
This paper describes a clustering algorithm for vector quantizers using a "stochastic association model". It offers a new simple and powerful softmax adaptation rule. Th...
The major limitation in bilingual latent semantic analysis (bLSA) is the requirement of parallel training corpora. Motivated by semi-supervised learning, we propose a cluster-based...