Sciweavers

81 search results - page 13 / 17
» The Optimal Reward Baseline for Gradient-Based Reinforcement...
IIE
2007
13 years 7 months ago
Investigation of Q-Learning in the Context of a Virtual Learning Environment
We investigate the possibility of applying the well-known Q-learning algorithm in the domain of a Virtual Learning Environment (VLE). It is important in this problem doma...
Dalia Baziukaite
AGENTS
1999
Springer
13 years 12 months ago
General Principles of Learning-Based Multi-Agent Systems
We consider the problem of how to design large decentralized multiagent systems (MAS’s) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a...
David Wolpert, Kevin R. Wheeler, Kagan Tumer
ATAL
2010
Springer
13 years 7 months ago
PAC-MDP learning with knowledge-based admissible models
PAC-MDP algorithms approach the exploration-exploitation problem of reinforcement learning agents in an effective way that guarantees that, with high probability, the algorithm pe...
Marek Grzes, Daniel Kudenko
ICML
2010
IEEE
13 years 8 months ago
Feature Selection as a One-Player Game
This paper formalizes Feature Selection as a Reinforcement Learning problem, leading to a provably optimal though intractable selection policy. As a second contribution, this pape...
Romaric Gaudel, Michèle Sebag
COLT
2010
Springer
13 years 5 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura