SMOTE: Synthetic Minority Over-sampling Technique

An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the mino...
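The abstract above names the two ingredients (over-sampling the minority class with synthetic examples, and under-sampling the majority class) but does not spell out the procedure. The following is a minimal sketch, assuming the commonly described SMOTE step of interpolating between a minority example and one of its k nearest minority neighbours. The function names (`smote_like_oversample`, `undersample`, `rebalance`) and all parameters are hypothetical, chosen for illustration only; the ROC-space evaluation described in the abstract is not reproduced here.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points (hypothetical helper, not the paper's code).

    Each synthetic point lies on the line segment between a randomly chosen
    minority example and one of its k nearest minority neighbours.
    """
    if rng is None:
        rng = random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def undersample(majority, n_keep, rng=None):
    """Randomly keep at most n_keep majority examples, without replacement."""
    if rng is None:
        rng = random.Random(0)
    return rng.sample(majority, min(n_keep, len(majority)))


def rebalance(majority, minority, n_new, n_keep, k=5, seed=0):
    """Combine majority under-sampling with synthetic minority over-sampling."""
    rng = random.Random(seed)
    kept = undersample(majority, n_keep, rng)
    synth = smote_like_oversample(minority, n_new, k, rng)
    X = kept + list(minority) + synth
    y = [0] * len(kept) + [1] * (len(minority) + len(synth))  # 1 = minority
    return X, y


if __name__ == "__main__":
    majority = [(float(i), float(i % 7)) for i in range(100)]
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    X, y = rebalance(majority, minority, n_new=6, n_keep=10, k=3)
    print(len(X), y.count(0), y.count(1))
```

Because every synthetic point is a convex combination of two existing minority examples, the new points stay inside the region already spanned by the minority class instead of duplicating existing rows. How many synthetic points to create and how many majority examples to keep are tuning choices in this sketch, not values taken from the paper.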

Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, W. Philip Kegelmeyer

Classifier Performance | JAIR 2002 | Majority Class | Minority Class |

Added	22 Dec 2010
Updated	22 Dec 2010
Type	Journal
Year	2002
Where	JAIR
Authors	Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, W. Philip Kegelmeyer
