A Hybrid Re-sampling Method for SVM Learning from Imbalanced Data Sets

Support Vector Machines (SVMs) have been widely studied and applied successfully in many fields. However, SVM performance drops significantly on imbalanced data sets, in which negative instances greatly outnumber positive instances. This paper analyzes the intrinsic factors behind this failure and proposes a suitable re-sampling method. We re-sample the imbalanced data using variable SOM clustering to overcome the flaws of traditional re-sampling methods, such as heavy randomness, subjective interference, and information loss. We then prune the training set with the K-NN rule to resolve data confusion, which improves the generalization ability of the SVM. Experimental results show that our method clearly improves the performance of the SVM on imbalanced data sets.
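
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the variable SOM clustering, the pruning step is a generic edited-nearest-neighbour style rule, and the function names (cluster_undersample, knn_prune), the label convention (0 for the negative majority class, 1 for the positive minority class), and all parameter values are assumptions made for illustration. SVM training on the pruned set is left out.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Group each point under the index of its nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        groups[nearest].append(p)
    return groups


def cluster_undersample(points, n_clusters, n_iter=20, seed=0):
    """Reduce `points` to at most `n_clusters` representative samples.

    Assumption: k-means replaces the SOM clustering named in the abstract.
    Each non-empty cluster contributes the real sample closest to its centroid.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = assign(points, centroids)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    groups = assign(points, centroids)
    return [min(g, key=lambda p: dist(p, centroids[i]))
            for i, g in enumerate(groups) if g]


def knn_prune(samples, k=3, majority_label=0):
    """Drop majority-class samples whose k nearest neighbours mostly
    carry the other label. Minority samples are always kept.

    `samples` is a list of (point, label) pairs.
    """
    kept = []
    for i, (x, y) in enumerate(samples):
        if y != majority_label:
            kept.append((x, y))
            continue
        others = [s for j, s in enumerate(samples) if j != i]
        neighbours = sorted(others, key=lambda s: dist(x, s[0]))[:k]
        same = sum(1 for _, label in neighbours if label == y)
        if same > k / 2:
            kept.append((x, y))
    return kept


# Toy data: a dense negative block, a few positives, and one negative
# sample sitting inside the positive region.
negatives = [(x * 0.5, y * 0.5) for x in range(6) for y in range(6)]
positives = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]

reduced = cluster_undersample(negatives, n_clusters=8)
training = ([(p, 0) for p in reduced] + [((5.5, 5.5), 0)]
            + [(p, 1) for p in positives])
pruned = knn_prune(training, k=3)
print(len(negatives), "negatives reduced to", len(reduced))
print(len(training), "training samples pruned to", len(pruned))
```

Keeping real samples rather than centroids as cluster representatives is one possible design choice; it avoids introducing synthetic points, but the paper may select representatives differently.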

Peng Li, Pei-Li Qiao, Yuan-Chao Liu

FSKD 2008 | Fuzzy Logic | Imbalanced Data Sets | Re-sampling Methods | Support Vector Machine

Post Info

Added	09 Nov 2010
Updated	09 Nov 2010
Type	Conference
Year	2008
Where	FSKD
Authors	Peng Li, Pei-Li Qiao, Yuan-Chao Liu
