A Distance-Based Over-Sampling Method for Learning from Imbalanced Data Sets

Many real-world domains present the problem of imbalanced data sets, where examples of one class significantly outnumber examples of the other classes. This makes learning difficult, because learning algorithms that optimize accuracy over all training examples tend to classify every example as belonging to the majority class. We introduce a method that addresses this problem by creating a balanced data set, which improves the performance of classifiers. Our method over-samples the minority class, using a randomized weighted distance scheme to generate synthetic examples in the neighborhood of each minority example.
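
The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of distance-based over-sampling under stated assumptions, not the authors' algorithm. It assumes that neighbors are chosen with probability proportional to inverse Euclidean distance and that each synthetic example is a random interpolation between a minority example and the chosen neighbor. The function names `oversample_minority` and `balance`, and the parameters `k` and `seed`, are hypothetical.

```python
import math
import random


def oversample_minority(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points near existing minority examples.

    Assumed scheme (not taken from the paper): a neighbor is drawn with
    probability proportional to inverse distance, and the new point is a
    random interpolation between the example and that neighbor.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    rng = random.Random(seed)
    synthetic = []
    for i in range(n_new):
        idx = i % len(minority)  # cycle through the minority examples
        base = minority[idx]
        others = [p for j, p in enumerate(minority) if j != idx]
        # k nearest other minority examples, by Euclidean distance
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        # Inverse-distance weights: closer neighbors are picked more often
        weights = [1.0 / (math.dist(base, p) + 1e-12) for p in neighbors]
        target = rng.choices(neighbors, weights=weights, k=1)[0]
        t = rng.random()  # random position on the segment base -> target
        synthetic.append(tuple(b + t * (q - b) for b, q in zip(base, target)))
    return synthetic


def balance(majority, minority, k=3, seed=0):
    """Pad the minority class with synthetic points up to the majority size."""
    n_new = max(0, len(majority) - len(minority))
    return list(minority) + oversample_minority(minority, n_new, k, seed)
```

Calling `balance(majority, minority)` returns the original minority examples followed by enough synthetic ones to match the majority class size. Because every synthetic point lies on a segment between two minority examples, it stays inside the region the minority class already occupies.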

Jorge de la Calleja, Olac Fuentes

Artificial Intelligence | FLAIRS 2007 | Imbalanced Data Sets | Many Real-world Domains | Training Examples |

Post Info

Added	02 Oct 2010
Updated	02 Oct 2010
Type	Conference
Year	2007
Where	FLAIRS
Authors	Jorge de la Calleja, Olac Fuentes
