Sciweavers

KDD
2008
ACM
Data Mining
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
KDD
2010
ACM
Data Mining
Flexible constrained spectral clustering
Constrained clustering has been well-studied for algorithms like K-means and hierarchical agglomerative clustering. However, how to encode constraints into spectral clustering rem...
Xiang Wang, Ian Davidson
GECCO
2008
Springer
Optimization
Mask functions for the symbolic modeling of epistasis using genetic programming
The study of common, complex multifactorial diseases in genetic epidemiology is complicated by nonlinearity in the genotype-to-phenotype mapping relationship that is due, in part,...
Ryan J. Urbanowicz, Nate Barney, Bill C. White, Ja...
SDM
2009
SIAM
Data Mining
Measuring Discrimination in Socially-Sensitive Decision Records
Discrimination in social sense (e.g., against minorities and disadvantaged groups) is the subject of many laws worldwide, and it has been extensively studied in the social and eco...
Dino Pedreschi, Franco Turini, Salvatore Ruggieri
KDD
1997
ACM
Data Mining
Process-Based Database Support for the Early Indicator Method
In Wirth & Reinartz (1996), we introduced the early indicator method, a multi-strategy approach for the efficient prediction of various aspects of the fault profile of a set of car...
Christoph Breitner, Jörg Schlösser, Rü...