Clustering in data mining is a discovery process that groups a set of data such that the intracluster similarity is maximized and the intercluster similarity is minimized. These d...
Eui-Hong Han, George Karypis, Vipin Kumar, Bamshad...
The importance of gene expression data in cancer diagnosis and treatment has been widely recognized by cancer researchers in recent years. However, one of the major challen...
Rui Xu, Steven Damelin, Boaz Nadler, Donald C. Wun...
A crucial issue in many classification applications is how to achieve the best possible classifier with a limited amount of labeled training data. Training data selection is ...
In this paper, we propose GAD (General Activity Detection) for fast clustering on large-scale data. Within this framework, we design a set of algorithms for different scenarios: (...
Jiawei Han, Liangliang Cao, Sangkyum Kim, Xin Jin,...
This paper introduces the CondorJ2 cluster management system. Traditionally, cluster management systems such as Condor employ a process-oriented approach with little or no use of ...