Sciweavers

» MiTAP for real users, real data, real problems
ICML
2006
IEEE
14 years 10 months ago
Discriminative cluster analysis
Clustering is one of the most widely used statistical tools for data analysis. Among all existing clustering techniques, k-means is a very popular method because of its ease of pr...
Fernando De la Torre, Takeo Kanade
WWW
2008
ACM
14 years 10 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
KDD
2007
ACM
14 years 9 months ago
GraphScope: parameter-free mining of large time-evolving graphs
How can we find communities in dynamic networks of social interactions, such as who calls whom, who emails whom, or who sells to whom? How can we spot discontinuity timepoints in ...
Jimeng Sun, Christos Faloutsos, Spiros Papadimitri...
KDD
2007
ACM
14 years 9 months ago
Scalable look-ahead linear regression trees
Most decision tree algorithms base their splitting decisions on a piecewise constant model. Often these splitting algorithms are extrapolated to trees with non-constant models at ...
David S. Vogel, Ognian Asparouhov, Tobias Scheffer
KDD
2006
ACM
14 years 9 months ago
Summarizing itemset patterns using probabilistic models
In this paper, we propose a novel probabilistic approach to summarize frequent itemset patterns. Such techniques are useful for summarization, post-processing, and end-user interp...
Chao Wang, Srinivasan Parthasarathy