Sciweavers

» A Partial-Repeatability Approach to Data Mining
KDD
2008
ACM
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2002
ACM
Tumor cell identification using features rules
Advances in imaging techniques have led to large repositories of images. There is an increasing demand for automated systems that can analyze complex medical images and extract me...
Bin Fang, Wynne Hsu, Mong-Li Lee
KDD
2008
ACM
Automatic identification of quasi-experimental designs for discovering causal knowledge
Researchers in the social and behavioral sciences routinely rely on quasi-experimental designs to discover knowledge from large databases. Quasi-experimental designs (QEDs) exploi...
David D. Jensen, Andrew S. Fast, Brian J. Taylor, ...
KDD
2005
ACM
Detection of emerging space-time clusters
We propose a new class of spatio-temporal cluster detection methods designed for the rapid detection of emerging space-time clusters. We focus on the motivating application of pro...
Daniel B. Neill, Andrew W. Moore, Maheshkumar Sabh...
KDD
2004
ACM
Diagnosing extrapolation: tree-based density estimation
There has historically been very little concern with extrapolation in Machine Learning, yet extrapolation can be critical to diagnose. Predictor functions are almost always learne...
Giles Hooker