Selecting Near-Optimal Learners via Incremental Data Allocation

We study a novel machine learning (ML) problem setting of sequentially allocating small subsets of training data amongst a large set of classifiers. The goal is to select a classifier that will give near-optimal accuracy when trained on all data, while also minimizing the cost of misallocated samples. This is motivated by large modern datasets and ML toolkits with many combinations of learning algorithms and hyperparameters. Inspired by the principle of “optimism under uncertainty,” we propose an innovative strategy, Data Allocation using Upper Bounds (DAUB), which robustly achieves these objectives across a variety of real-world datasets. We further develop substantial theoretical support for DAUB in an idealized setting where the expected accuracy of a classifier trained on n samples can be known exactly. Under these conditions we establish a rigorous sub-linear bound on the regret of the approach (in terms of misallocated data), as well as a rigorous bound on suboptimality o...
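The abstract does not spell out the allocation rule, so the following is only a minimal sketch of the general "optimism under uncertainty" idea, not the authors' DAUB implementation. It assumes each learner exposes a hypothetical function mapping a training-set size n to its accuracy (the idealized setting mentioned above), uses the slope between the last two evaluations to project an optimistic accuracy at the full data size, and grows the allocation of the most promising learner geometrically. The function name, the `start` and `ratio` parameters, the cost measure, and the synthetic learning curves are all assumptions made for illustration.

```python
import math

def allocate_with_upper_bounds(learners, total, start=100, ratio=2.0):
    """Pick a learner by repeatedly giving more data to the one whose
    optimistic projected accuracy at `total` samples is highest.

    learners: dict mapping a name to a function n -> accuracy on n samples
              (hypothetical interface, assumed known exactly).
    Returns (chosen_name, cost), where cost sums the training-set sizes of
    all training runs performed (an assumed cost measure).
    """
    if int(start * ratio) <= start or start >= total:
        raise ValueError("need start < total and ratio > 1")

    state = {}  # name -> (previous n, previous acc, current n, current acc)
    cost = 0
    for name, acc in learners.items():
        n0, n1 = start, min(total, int(start * ratio))
        state[name] = (n0, acc(n0), n1, acc(n1))
        cost += n0 + n1

    def upper_bound(name):
        # Linear extrapolation of the last observed slope, capped at 1.0.
        n0, a0, n1, a1 = state[name]
        slope = max(0.0, (a1 - a0) / (n1 - n0))
        return min(1.0, a1 + slope * (total - n1))

    while True:
        best = max(state, key=upper_bound)
        n0, a0, n1, a1 = state[best]
        if n1 >= total:
            # The most optimistic learner is already trained on all data.
            return best, cost
        n2 = min(total, int(math.ceil(n1 * ratio)))
        state[best] = (n1, a1, n2, learners[best](n2))
        cost += n2

# Synthetic, concave learning curves (illustrative assumptions only).
curves = {
    "fast_but_weak": lambda n: 0.80 - 0.5 / math.sqrt(n),
    "slow_but_strong": lambda n: 0.95 - 2.0 / math.sqrt(n),
    "poor": lambda n: 0.60 - 0.1 / math.sqrt(n),
}
chosen, cost = allocate_with_upper_bounds(curves, total=100_000)
print(chosen, cost)
```

For increasing, concave learning curves such as these synthetic ones, extrapolating the most recent slope never underestimates the accuracy at the full data size, which is what makes the projection a valid optimistic bound in this sketch; real learning curves are noisy and need not satisfy this assumption.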

Ashish Sabharwal, Horst Samulowitz, Gerald Tesauro

CORR 2016 | Education |

Post Info

Added	01 Apr 2016
Updated	01 Apr 2016
Type	Journal
Year	2016
Where	CORR
Authors	Ashish Sabharwal, Horst Samulowitz, Gerald Tesauro
