
TDP
2010

Communication-Efficient Privacy-Preserving Clustering

The ability to store vast quantities of data and the emergence of high-speed networking have led to intense interest in distributed data mining. However, privacy concerns, as well as regulations, often prevent the sharing of data between multiple parties. Privacy-preserving distributed data mining allows the cooperative computation of data mining algorithms without requiring the participating organizations to reveal their individual data items to each other. This paper makes several contributions. First, we present a simple, deterministic, I/O-efficient k-clustering algorithm that was designed with the goal of enabling an efficient privacy-preserving version of the algorithm. Our algorithm examines each item in the database only once and uses only sequential access to the data. Our experiments show that this algorithm produces cluster centers that are, on average, more accurate than the ones produced by the well-known iterative k-means algorithm, and that it compares well against BIRCH. Second,...
Added 21 May 2011
Updated 21 May 2011
Type Journal
Year 2010
Where TDP
Authors Geetha Jagannathan, Krishnan Pillaipakkamnatt, Rebecca N. Wright, Daryl Umano
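
The abstract describes a deterministic k-clustering algorithm that reads each item once, using only sequential access. The sketch below is not the paper's algorithm, which the truncated abstract does not specify. It is a generic single-pass illustration under stated assumptions: each cluster is summarized by a count and a coordinate sum, every new point starts as its own cluster, and whenever more than k clusters exist the two whose merge adds the least squared error (a Ward-style cost, chosen here for illustration) are merged. The names `merge_cost` and `one_pass_k_clustering` are hypothetical.

```python
from itertools import combinations


def merge_cost(a, b):
    """Ward-style cost: increase in sum of squared error if a and b are merged.

    Each cluster is a (count, coordinate_sum) pair. This cost function is an
    assumption made for illustration, not taken from the paper.
    """
    (na, sa), (nb, sb) = a, b
    dist2 = sum((x / na - y / nb) ** 2 for x, y in zip(sa, sb))
    return na * nb / (na + nb) * dist2


def one_pass_k_clustering(points, k):
    """Read `points` once, in order, keeping at most k cluster summaries."""
    clusters = []
    for p in points:
        # Each incoming point starts as a singleton cluster.
        clusters.append((1, tuple(float(x) for x in p)))
        if len(clusters) > k:
            # Pick the cheapest pair to merge; min() returns the first
            # minimum, so ties are broken deterministically by index order.
            i, j = min(
                combinations(range(len(clusters)), 2),
                key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
            )
            (ni, si), (nj, sj) = clusters[i], clusters[j]
            clusters[i] = (ni + nj, tuple(x + y for x, y in zip(si, sj)))
            del clusters[j]  # j > i, so index i is unaffected
    # Convert each (count, sum) summary into a cluster center.
    return [tuple(x / n for x in s) for n, s in clusters]


centers = one_pass_k_clustering(
    [(0, 0), (0, 1), (10, 10), (10, 11), (0, 0.5)], k=2
)
print(centers)
```

Because only counts and sums are kept, memory use in this sketch depends on k and the dimension, not on the number of items read.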