Sciweavers

» Mining for Patterns in Contradictory Data
KDD
2008
ACM
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
ICDM
2009
IEEE
Significance of Episodes Based on Minimal Windows
Discovering episodes, frequent sets of events from a sequence, has been an active field in pattern mining. Traditionally, a level-wise approach is used to discover all frequent epis...
Nikolaj Tatti
KDD
2006
ACM
Reducing the human overhead in text categorization
Many applications in text processing require significant human effort for either labeling large document collections (when learning statistical models) or extrapolating rules from...
Arnd Christian König, Eric Brill
KDD
2006
ACM
New EM derived from Kullback-Leibler divergence
We introduce a new EM framework in which it is possible to optimize not only the model parameters but also the number of model components. A key feature of our approach is that we...
Longin Jan Latecki, Marc Sobel, Rolf Lakämper
KDD
2004
ACM
Kernel k-means: spectral clustering and normalized cuts
Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have re...
Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis