Sciweavers

1233 search results - page 37 / 247
» Reinforcement Learning in MirrorBot
EWCBR
2006
Springer
14 years 2 months ago
Multi-agent Case-Based Reasoning for Cooperative Reinforcement Learners
In both research fields, Case-Based Reasoning and Reinforcement Learning, the system under consideration gains its expertise from experience. Utilizing this fundamental c...
Thomas Gabel, Martin Riedmiller
IJCAI
2001
14 years 10 days ago
R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Ronen I. Brafman, Moshe Tennenholtz
GECCO
2009
Springer
13 years 8 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
AI
2006
Springer
14 years 2 months ago
Trace Equivalence Characterization Through Reinforcement Learning
In the context of probabilistic verification, we provide a new notion of trace-equivalence divergence between pairs of Labelled Markov processes. This divergence corresponds to the...
Josee Desharnais, François Laviolette, Kris...
AAAI
2007
14 years 1 months ago
A Reinforcement Learning Algorithm with Polynomial Interaction Complexity for Only-Costly-Observable MDPs
An Unobservable MDP (UMDP) is a POMDP in which there are no observations. An Only-Costly-Observable MDP (OCOMDP) is a POMDP which extends a UMDP by allowing a particular costly a...
Roy Fox, Moshe Tennenholtz