Sciweavers

138 search results - page 14 / 28
» Approximated Clustering of Distributed High-Dimensional Data
APPT
2005
Springer
14 years 1 month ago
Principal Component Analysis for Distributed Data Sets with Updating
Identifying the patterns of large data sets is a key requirement in data mining. A powerful technique for this purpose is the principal component analysis (PCA). PCA-based clusteri...
Zheng-Jian Bai, Raymond H. Chan, Franklin T. Luk
KDD
2009
ACM
14 years 8 months ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
CVPR
2005
IEEE
14 years 9 months ago
A Bayesian Approach to Unsupervised Feature Selection and Density Estimation Using Expectation Propagation
We propose an approximate Bayesian approach for unsupervised feature selection and density estimation, where the importance of the features for clustering is used as the measure f...
Shaorong Chang, Nilanjan Dasgupta, Lawrence Carin
KAIS
2006
13 years 7 months ago
Fast and exact out-of-core and distributed k-means clustering
Clustering has been one of the most widely studied topics in data mining and k-means clustering has been one of the popular clustering algorithms. K-means requires several passes ...
Ruoming Jin, Anjan Goswami, Gagan Agrawal
INTERNET
2006
13 years 7 months ago
Distributed Data Mining in Peer-to-Peer Networks
Distributed data mining deals with the problem of data analysis in environments with distributed data, computing nodes, and users. Peer-to-peer computing is emerging as a new dist...
Souptik Datta, Kanishka Bhaduri, Chris Giannella, ...