Large-scale logistic regression arises in many applications such as document classification and natural language processing. In this paper, we apply a trust region Newton method t...
The classical (ad hoc) document retrieval problem has traditionally been approached through ranking according to heuristically developed functions (such as tf.idf or BM25) or gene...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process generating the corpus and utilize a novel,...
Background: One step in the model organism database curation process is to find, for each article, the identifier of every gene discussed in the article. We consider a relaxation ...
Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are...
Brian Kulis, Sugato Basu, Inderjit S. Dhillon, Ray...