Distributed Pasting of Small Votes

Download csmr.ca.sandia.gov

Bagging and boosting are two popular ensemble methods that achieve better accuracy than a single classifier. These techniques have limitations on massive datasets, as the size of the dataset can be a bottleneck. Voting many classifiers built on small subsets of data ("pasting small votes") is a promising approach for learning from massive datasets. Pasting small votes can utilize the power of boosting and bagging, and potentially scale up to massive datasets. We propose a framework for building hundreds or thousands of such classifiers on small subsets of data in a distributed environment. Experiments show this approach is fast, accurate, and scalable to massive datasets.
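
The idea in the abstract can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the authors' framework: the base learner is a simple decision stump, the small subsets are drawn uniformly at random, and the "distributed" part is simulated by looping over disjoint partitions that could each be handled by a separate processor. All function names (train_stump, paste_small_votes, distributed_paste, predict) are hypothetical.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-feature threshold classifier to a list of (x, y) pairs."""
    best = None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            t = x[f]  # candidate threshold taken from the sample itself
            left = [y for xi, y in sample if xi[f] <= t]
            right = [y for xi, y in sample if xi[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # degenerate sample: always predict the majority label
        label = Counter(y for _, y in sample).most_common(1)[0][0]
        return lambda x: label
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab

def paste_small_votes(partition, n_classifiers, bite_size, rng):
    """Train many classifiers, each on a small random subset of one partition."""
    return [train_stump(rng.sample(partition, bite_size))
            for _ in range(n_classifiers)]

def distributed_paste(data, n_partitions, n_classifiers, bite_size, seed=0):
    """Split the data into disjoint partitions and pool the classifiers."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    partitions = [shuffled[i::n_partitions] for i in range(n_partitions)]
    ensemble = []
    for part in partitions:  # assumption: each pass could run on its own node
        ensemble.extend(paste_small_votes(part, n_classifiers, bite_size, rng))
    return ensemble

def predict(ensemble, x):
    """Combine the classifiers by unweighted majority vote."""
    return Counter(clf(x) for clf in ensemble).most_common(1)[0][0]

# Toy usage: the label is 1 when the first feature exceeds 0.5.
data_rng = random.Random(42)
points = [(data_rng.random(), data_rng.random()) for _ in range(400)]
data = [(p, int(p[0] > 0.5)) for p in points]
ensemble = distributed_paste(data, n_partitions=4, n_classifiers=10, bite_size=20)
print(predict(ensemble, (0.9, 0.3)), predict(ensemble, (0.1, 0.7)))
```

Each classifier sees only 20 of the 400 examples, so no single training step needs the whole dataset in memory; accuracy comes from voting across the pooled ensemble.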

Nitesh V. Chawla, Lawrence O. Hall, Kevin W. Bowyer, Thomas E. Moore, W. Philip Kegelmeyer

Massive Datasets | MCS 2002 | Pattern Recognition | Popular Ensemble Methods | Small Votes

Post Info

Added	22 Dec 2010
Updated	22 Dec 2010
Type	Journal
Year	2002
Where	MCS
Authors	Nitesh V. Chawla, Lawrence O. Hall, Kevin W. Bowyer, Thomas E. Moore, W. Philip Kegelmeyer
