Search Sciweavers | Sciweavers

25

ICML
2006
IEEE

103views Machine Learning» more ICML 2006»

Using inaccurate models in reinforcement learning

14 years 9 months ago

In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for ...

Pieter Abbeel, Morgan Quigley, Andrew Y. Ng

claim paper

Read More »

32

click to vote

ICML
2006
IEEE

147views Machine Learning» more ICML 2006»

A choice model with infinitely many latent features

14 years 9 months ago

Download www.gatsby.ucl.ac.uk

Elimination by aspects (EBA) is a probabilistic choice model describing how humans decide between several options. The options from which the choice is made are characterized by b...

Carl Edward Rasmussen, Dilan Görür, Fran...

claim paper

Read More »

25

click to vote

ICML
2006
IEEE

136views Machine Learning» more ICML 2006»

An analytic solution to discrete Bayesian reinforcement learning

14 years 9 months ago

Download www.cs.uwaterloo.ca

Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms co...

Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, Kevi...

claim paper

Read More »

30

click to vote

ICML
2004
IEEE

214views Machine Learning» more ICML 2004»

Apprenticeship learning via inverse reinforcement learning

14 years 9 months ago

Download ai.stanford.edu

We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...

Pieter Abbeel, Andrew Y. Ng

claim paper

Read More »

24

click to vote

ICML
2003
IEEE

121views Machine Learning» more ICML 2003»

Planning in the Presence of Cost Functions Controlled by an Adversary

14 years 9 months ago

Download www.cs.cmu.edu

We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a rob...

H. Brendan McMahan, Geoffrey J. Gordon, Avrim Blum

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers