Sciweavers

473 search results - page 12 / 95
» Optimal policy switching algorithms for reinforcement learni...
Sort
View
GECCO
2006
Springer
208views Optimization» more  GECCO 2006»
13 years 11 months ago
Comparing evolutionary and temporal difference methods in a reinforcement learning domain
Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical com...
Matthew E. Taylor, Shimon Whiteson, Peter Stone
JAIR
2011
144views more  JAIR 2011»
13 years 2 months ago
Non-Deterministic Policies in Markovian Decision Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making proble...
Mahdi Milani Fard, Joelle Pineau
COLT
2010
Springer
13 years 5 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
SAB
2010
Springer
189views Optimization» more  SAB 2010»
13 years 5 months ago
TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs
Reinforcement learning is one of the main adaptive mechanisms that is both well documented in animal behaviour and giving rise to computational studies in animats and robots. In th...
Olga Kozlova, Olivier Sigaud, Christophe Meyer
NIPS
1998
13 years 9 months ago
Risk Sensitive Reinforcement Learning
In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...
Ralph Neuneier, Oliver Mihatsch