
ICDCS
2006
IEEE

ParRescue: Scalable Parallel Algorithm and Implementation for Biclustering over Large Distributed Datasets

Biclustering refers to simultaneously capturing correlations present among subsets of attributes (columns) and records (rows). It is widely used in data mining applications, including biological data analysis, financial forecasting, and text mining. Biclustering algorithms are significantly more complex than classical one-dimensional clustering techniques, particularly those requiring multiple computing platforms for large and distributed data sets. In this paper, we develop an efficient, scalable algorithm, referred to as ParRescue (Parallel Residue Co-clustering), that is capable of performing biclustering on extremely large or geographically distributed data sets. ParRescue divides the clustering tasks among processors with minimal communication cost, making it scalable over a large number of computing nodes. The proposed implementation is based on an existing sequential approach that has been modified to be amenable to parallel implementation. The proposed ParRescue algorit...
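The abstract does not state which residue measure ParRescue minimizes. As a rough illustration of what a "residue" over a row and column subset can look like, the sketch below computes the mean squared residue, a coherence score commonly used in the biclustering literature. It is an assumption chosen for illustration, not the paper's confirmed objective. The function name `mean_squared_residue` and the sample matrix are hypothetical.

```python
from typing import Sequence


def mean_squared_residue(
    data: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
) -> float:
    """Mean squared residue of the submatrix picked out by rows and cols.

    Hypothetical helper for illustration only; the abstract does not
    confirm that ParRescue uses this exact measure.
    """
    n_rows, n_cols = len(rows), len(cols)
    sub = [[data[i][j] for j in cols] for i in rows]

    # Row means, column means, and overall mean of the submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Residue of one entry: value - row mean - column mean + overall mean.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


# Made-up data: rows 0, 1, 3 and columns 0, 1, 3 form an additive pattern.
data = [
    [1.0, 2.0, 9.0, 3.0],
    [2.0, 3.0, 1.0, 4.0],
    [7.0, 0.0, 5.0, 8.0],
    [4.0, 5.0, 2.0, 6.0],
]
coherent = mean_squared_residue(data, rows=[0, 1, 3], cols=[0, 1, 3])
whole = mean_squared_residue(data, rows=range(4), cols=range(4))
```

Under this measure, a subset of rows and columns whose entries differ only by row and column offsets scores close to zero, while the full matrix scores higher. A biclustering method searches for row and column subsets with low scores of this kind.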
Jianhong Zhou, Ashfaq A. Khokhar
Added 11 Jun 2010
Updated 11 Jun 2010
Type Conference
Year 2006
Where ICDCS
Authors Jianhong Zhou, Ashfaq A. Khokhar