GRID environments are privileged targets for computation-intensive problem solving in areas from weather forecasting to seismic analysis. Mainly composed by commodity hardware, th...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering&...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
In this paper we propose to study budget semi-supervised learning, i.e., semi-supervised learning with a resource budget, such as a limited memory insufficient to accommodate and/...
Zhi-Hua Zhou, Michael Ng, Qiao-Qiao She, Yuan Jian...
We describe a fast algorithm for kernel discriminant analysis, empirically demonstrating asymptotic speed-up over the previous best approach. We achieve this with a new pattern of...
We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...