Sciweavers

EWRL
2008
Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case
We consider reinforcement learning in the parameterized setup, where the model is known to belong to a parameterized family of Markov Decision Processes (MDPs). We further impose ...
Kirill Dyagilev, Shie Mannor, Nahum Shimkin
EWRL
2008
Variable Metric Reinforcement Learning Methods Applied to the Noisy Mountain Car Problem
Two variable metric reinforcement learning methods, the natural actor-critic algorithm and the covariance matrix adaptation evolution strategy, are compared on a conceptual level a...
Verena Heidrich-Meisner, Christian Igel
EWRL
2008
New Error Bounds for Approximations from Projected Linear Equations
We consider linear fixed point equations and their approximations by projection on a low dimensional subspace. We derive new bounds on the approximation error of the solution, whi...
Huizhen Yu, Dimitri P. Bertsekas
EWRL
2008
Probabilistic Inference for Fast Learning in Control
Carl Edward Rasmussen, Marc Peter Deisenroth
EWRL
2008
Markov Decision Processes with Arbitrary Reward Processes
We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...
Jia Yuan Yu, Shie Mannor, Nahum Shimkin
EWRL
2008
Optimistic Planning of Deterministic Systems
If one possesses a model of a controlled deterministic system, then from any state, one may consider the set of all possible reachable states starting from that state and using any...
Jean-François Hren, Rémi Munos
EWRL
2008
Regularized Fitted Q-Iteration: Application to Planning
We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-i...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
EWRL
2008
Exploiting Additive Structure in Factored MDPs for Reinforcement Learning
Thomas Degris, Olivier Sigaud, Pierre-Henri Wuille...
EWRL
2008
Policy Learning - A Unified Perspective with Applications in Robotics
Policy learning approaches are among the best-suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...
Jan Peters, Jens Kober, Duy Nguyen-Tuong