Sciweavers

1085 search results for "Active Mining in a Distributed Setting"
OSDI
2008
ACM
14 years 7 days ago
DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language
DryadLINQ is a system and a set of language extensions that enable a new programming model for large scale distributed computing. It generalizes previous execution environments su...
Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Bud...
ICDM
2010
IEEE
13 years 8 months ago
Multi-label Feature Selection for Graph Classification
Nowadays, the classification of graph data has become an important and active research topic in the last decade, which has a wide variety of real world applications, e.g. drug acti...
Xiangnan Kong, Philip S. Yu
ICAI
2004
13 years 11 months ago
A Comparison of Resampling Methods for Clustering Ensembles
Combination of multiple clusterings is an important task in the area of unsupervised learning. Inspired by the success of supervised bagging algorithms, we propose a resampling ...
Behrouz Minaei-Bidgoli, Alexander P. Topchy, Willi...
KDD
2006
ACM
14 years 10 months ago
Efficient anonymity-preserving data collection
The output of a data mining algorithm is only as good as its inputs, and individuals are often unwilling to provide accurate data about sensitive topics such as medical history an...
Justin Brickell, Vitaly Shmatikov
EDBTW
2010
Springer
14 years 4 months ago
A practice-oriented framework for measuring privacy and utility in data sanitization systems
Published data is prone to privacy attacks. Sanitization methods aim to prevent these attacks while maintaining usefulness of the data for legitimate users. Quantifying the trade-...
Michal Sramka, Reihaneh Safavi-Naini, Jörg De...