Sciweavers

779 search results - page 29 / 156
» Reinforcement Using Supervised Learning for Policy Generaliz...
Sort
View
ICML
2005
IEEE
14 years 8 months ago
Dynamic preferences in multi-criteria reinforcement learning
The current framework of reinforcement learning is based on maximizing the expected returns based on scalar rewards. But in many real world situations, tradeoffs must be made amon...
Sriraam Natarajan, Prasad Tadepalli
NIPS
2001
13 years 9 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
APIN
2004
81views more  APIN 2004»
13 years 7 months ago
Learning Generalized Policies from Planning Examples Using Concept Languages
In this paper we are concerned with the problem of learning how to solve planning problems in one domain given a number of solved instances. This problem is formulated as the probl...
Mario Martin, Hector Geffner
AGI
2011
12 years 11 months ago
Reinforcement Learning and the Bayesian Control Rule
We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intra...
Pedro Alejandro Ortega, Daniel Alexander Braun, Si...
IJCAI
2007
13 years 9 months ago
Bayesian Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...
Deepak Ramachandran, Eyal Amir