Rival Penalized Competitive Learning (RPCL) and its variants can perform clustering analysis efficiently with the ability of selecting the cluster number automatically. Although t...
Tao Li, Wenjiang Pei, Shao-ping Wang, Yiu-ming Che...
HadoopDB is a hybrid of MapReduce and DBMS technologies, designed to meet the growing demand of analyzing massive datasets on very large clusters of machines. Our previous work ha...
This paper presents a novel Second Order Cone Programming (SOCP) formulation for large scale binary classification tasks. Assuming that the class conditional densities are mixture...
J. Saketha Nath, Chiranjib Bhattacharyya, M. Naras...
We propose and evaluate a probabilistic framework for estimating a Twitter user’s city-level location based purely on the content of the user’s tweets, even in the absence of ...
We examine the learning-curve sampling method, an approach for applying machinelearning algorithms to large data sets. The approach is based on the observation that the computatio...