Sciweavers

371 search results - page 11 / 75
» Learning to match and cluster large high-dimensional data se...
SDM
2007
SIAM
13 years 9 months ago
On Privacy-Preservation of Text and Sparse Binary Data with Sketches
In recent years, privacy preserving data mining has become very important because of the proliferation of large amounts of data on the internet. Many data sets are inherently high...
Charu C. Aggarwal, Philip S. Yu
ICML
2004
IEEE
14 years 8 months ago
Automated hierarchical mixtures of probabilistic principal component analyzers
Many clustering algorithms fail when dealing with high dimensional data. Principal component analysis (PCA) is a popular dimensionality reduction algorithm. However, it assumes a ...
Ting Su, Jennifer G. Dy
ML
2006
ACM
13 years 7 months ago
A Unified View on Clustering Binary Data
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
Tao Li
SDM
2009
SIAM
14 years 4 months ago
Integrated KL (K-means - Laplacian) Clustering: A New Clustering Approach by Combining Attribute Data and Pairwise Relations
Most datasets in real applications come from multiple sources. As a result, we often have attribute information about data objects and various pairwise relations (similarity) ...
Fei Wang, Chris H. Q. Ding, Tao Li
SDM
2008
SIAM
13 years 9 months ago
Graph Mining with Variational Dirichlet Process Mixture Models
Graph data such as chemical compounds and XML documents are getting more common in many application domains. A main difficulty of graph data processing lies in the intrinsic high ...
Koji Tsuda, Kenichi Kurihara