Sciweavers

2497 search results - page 334 / 500
» A Partial-Repeatability Approach to Data Mining
SDM
2007
SIAM
HACS: Heuristic Algorithm for Clustering Subsets
The term consideration set is used in marketing to refer to the set of items a customer thought about purchasing before making a choice. While consideration sets are not directly ...
Ding Yuan, W. Nick Street
PKDD
2000
Springer
Learning Right Sized Belief Networks by Means of a Hybrid Methodology
Previous algorithms for the construction of belief network structures from data are mainly based either on independence criteria or on scoring metrics. The aim of this paper is to ...
Silvia Acid, Luis M. de Campos
SDM
2003
SIAM
Scalable, Balanced Model-based Clustering
This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. Partition...
Shi Zhong, Joydeep Ghosh
SIGSOFT
2007
ACM
Training on errors experiment to detect fault-prone software modules by spam filter
Detecting fault-prone modules in source code is important for software quality assurance. Most previous fault-prone detection approaches are based on software metrics...
Osamu Mizuno, Tohru Kikuno
SDM
2008
SIAM
Roughly Balanced Bagging for Imbalanced Data
Imbalanced class problems appear in many real applications of classification learning. We propose a novel sampling method to improve bagging for data sets with skewed class distri...
Shohei Hido, Hisashi Kashima