Sciweavers

» Database Publication Practices
SIGSOFT
2007
ACM
14 years 9 months ago
Training on errors experiment to detect fault-prone software modules by spam filter
Detecting fault-prone modules in source code is important for assuring software quality. Most previous fault-prone detection approaches are based on software metrics...
Osamu Mizuno, Tohru Kikuno
KDD
2009
ACM
14 years 9 months ago
Large-scale sparse logistic regression
Logistic Regression is a well-known classification method that has been used widely in many applications of data mining, machine learning, computer vision, and bioinformatics. Spa...
Jun Liu, Jianhui Chen, Jieping Ye
KDD
2009
ACM
14 years 9 months ago
Adapting the right measures for K-means clustering
Clustering validation is a long-standing challenge in the clustering literature. While many validation measures have been developed for evaluating the performance of clustering al...
Junjie Wu, Hui Xiong, Jian Chen
KDD
2008
ACM
14 years 9 months ago
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
KDD
2008
ACM
14 years 9 months ago
SAIL: summation-based incremental learning for information-theoretic clustering
Information-theoretic clustering aims to exploit information theoretic measures as the clustering criteria. A common practice on this topic is so-called INFO-K-means, which perfor...
Junjie Wu, Hui Xiong, Jian Chen