Search Sciweavers | Sciweavers

81 search results - page 12 / 17

» The Optimal Reward Baseline for Gradient-Based Reinforcement...

AAAI
2007

104 views, Intelligent Agents

Active Imitation Learning

15 years 9 months ago

Download www.cs.washington.edu

Imitation learning, also called learning by watching or programming by demonstration, has emerged as a means of accelerating many reinforcement learning tasks. Previous work has s...

Aaron P. Shon, Deepak Verma, Rajesh P. N. Rao


ECCV
2010
Springer

251 views, Computer Vision

Discriminative Tracking by Metric Learning

15 years 11 months ago

Download www.eecs.northwestern.edu

We present a discriminative model that casts appearance modeling and visual matching into a single objective for visual tracking. Most previous discriminative models for visual tra...


ATAL
2004
Springer

102 views, Intelligent Agents

A Pheromone-Based Utility Model for Collaborative Foraging

16 years 26 days ago

Download cs.gmu.edu

Multi-agent research often borrows from biology, where remarkable examples of collective intelligence may be found. One interesting example is ant colonies’ use of pheromones as...

Liviu Panait, Sean Luke


ACL
2008

127 views, Computational Linguistics

Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation

15 years 9 months ago

Download www.aclweb.org

We address two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating...

Verena Rieser, Oliver Lemon


COGSR
2011

71 views

Psychological models of human and optimal performance in bandit problems

15 years 2 months ago

Download www.socsci.uci.edu

In bandit problems, a decision-maker must choose between a set of alternatives, each of which has a fixed but unknown rate of reward, to maximize their total number of rewards ov...

Michael D. Lee, Shunan Zhang, Miles Munro, Mark St...

