A Comparison of Two Approaches to Data Mining from Imbalanced Data



Our objective is to compare two data mining approaches to dealing with imbalanced data sets. The first approach is based on keeping the original rule set, induced by the LEM2 algorithm, and changing the rule strength for all rules for the smaller class (concept) during classification. In the second approach, rule induction was split: the rule set for the larger class was induced by LEM2, while the rule set for the smaller class was induced by EXPLORE, another data mining algorithm. Results of our experiments show that both approaches increase sensitivity compared to the original LEM2. However, the difference in performance between the two approaches is statistically insignificant. Thus the appropriate approach to dealing with imbalanced data sets should be selected individually for a specific data set.
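The first approach can be illustrated with a minimal sketch. The abstract does not give the voting formula, so the code below assumes that each class is scored by summing the strengths of the rules that match a case, and that rules for the smaller class have their strength multiplied by a fixed factor at classification time. The names Rule, classify, sensitivity, and multiplier, as well as the toy rules and cases, are hypothetical and are not taken from the paper; the rules are written by hand rather than induced by LEM2.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict  # attribute name -> required value
    decision: str     # class predicted by the rule
    strength: float   # assumed: a count-like weight attached to the rule

    def matches(self, case: dict) -> bool:
        # A rule matches when every one of its conditions holds for the case.
        return all(case.get(a) == v for a, v in self.conditions.items())


def classify(case, rules, minority, multiplier=1.0, default=None):
    """Pick the class with the largest summed strength of matching rules.

    Strengths of rules for the minority class are scaled by `multiplier`
    (assumed voting scheme, not confirmed by the abstract).
    """
    support = {}
    for rule in rules:
        if rule.matches(case):
            factor = multiplier if rule.decision == minority else 1.0
            support[rule.decision] = support.get(rule.decision, 0.0) + rule.strength * factor
    if not support:
        return default  # no rule matched the case
    return max(support, key=support.get)


def sensitivity(cases, labels, rules, minority, multiplier=1.0):
    """Fraction of minority-class cases that are classified as minority."""
    positives = [c for c, y in zip(cases, labels) if y == minority]
    if not positives:
        raise ValueError("no minority-class cases to evaluate")
    hits = sum(classify(c, rules, minority, multiplier) == minority for c in positives)
    return hits / len(positives)


# Invented toy data, for illustration only.
rules = [
    Rule({"fever": "high"}, "sick", 2),
    Rule({"cough": "yes"}, "healthy", 5),
    Rule({"fever": "normal"}, "healthy", 8),
]
cases = [
    {"fever": "high", "cough": "yes"},
    {"fever": "high", "cough": "no"},
    {"fever": "normal", "cough": "yes"},
]
labels = ["sick", "sick", "healthy"]

baseline = sensitivity(cases, labels, rules, minority="sick", multiplier=1.0)
boosted = sensitivity(cases, labels, rules, minority="sick", multiplier=3.0)
print(baseline, boosted)
```

In this toy setting, raising the multiplier lets the single minority rule outvote a stronger majority rule on the first case, which is the mechanism by which sensitivity can increase without changing the rule set itself.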

Jerzy W. Grzymala-Busse, Jerzy Stefanowski, Szymon Wilk


Data Mining | Imbalanced Data Sets | KES 2004 | Smaller Class |


» Selective Preprocessing of Imbalanced Data for Improving Classification Performance

» An Imbalanced Data Rule Learner

» Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning

» A Comparison of Two Document Clustering Approaches for Clustering Medical Documents

» Building Useful Models from Imbalanced Data with Sampling and Boosting

» Improving SVM Classification on Imbalanced Data Sets in Distance Spaces

» A study in machine learning from imbalanced data for sentence boundary detection in speech

» Roughly Balanced Bagging for Imbalanced Data

Post Info

Added	02 Jul 2010
Updated	02 Jul 2010
Type	Conference
Year	2004
Where	KES
Authors	Jerzy W. Grzymala-Busse, Jerzy Stefanowski, Szymon Wilk
