Search Sciweavers | Sciweavers

109 search results - page 5 / 22

» Policy teaching through reward function learning


CORR
2011
Springer


Doubly Robust Policy Evaluation and Learning

14 years 10 months ago

Download www.icml-2011.org

We study decision making in environments where the reward is only partially observed, but can be modeled as a function of an action and an observed context. This setting, known as...

Miroslav Dudík, John Langford, Lihong Li



RAS
2010


Probabilistic Policy Reuse for inter-task transfer learning

15 years 5 months ago

Download scalab.uc3m.es

Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using past similar learned policies. The Policy Reuse learner improves its exploration b...

Fernando Fernández, Javier García, M...



ISCAS
2006
IEEE


Towards autonomous adaptive behavior in a bio-inspired CNN-controlled robot

16 years 21 days ago

Download web.mit.edu

This paper describes a general approach for the unsupervised learning of behaviors in a behavior-based robot. The key idea is to formalize a behavior produced by a Motor Map dr...

Paolo Arena, Luigi Fortuna, Mattia Frasca, Luca Pa...



ICML
2008
IEEE


Apprenticeship learning using linear programming

16 years 7 months ago

Download www.cs.ualberta.ca

In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...

Umar Syed, Michael H. Bowling, Robert E. Schapire



NIPS
2007


A Game-Theoretic Approach to Apprenticeship Learning

15 years 8 months ago

Download books.nips.cc

We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel ...

Umar Syed, Robert E. Schapire


