Sciweavers

81 search results - page 9 / 17
» The Optimal Reward Baseline for Gradient-Based Reinforcement...
ATAL
2007
Springer
14 years 1 month ago
Theoretical advantages of lenient Q-learners: an evolutionary game theoretic perspective
This paper presents the dynamics of multiple reinforcement learning agents from an Evolutionary Game Theoretic (EGT) perspective. We provide a Replicator Dynamics model for tradit...
Liviu Panait, Karl Tuyls
JAIR
2000
13 years 7 months ago
An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System for Email
This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method...
Marilyn A. Walker
ICML
2006
IEEE
14 years 8 months ago
Qualitative reinforcement learning
When the transition probabilities and rewards of a Markov Decision Process are specified exactly, the problem can be solved without any interaction with the environment. When no s...
Arkady Epshteyn, Gerald DeJong
IROS
2009
IEEE
14 years 2 months ago
Bayesian reinforcement learning in continuous POMDPs with Gaussian processes
Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle real-world sequential decision processes but require a known model to be solv...
Patrick Dallaire, Camille Besse, Stéphane R...
AAMAS
2005
Springer
14 years 1 month ago
Learning to Coordinate Using Commitment Sequences in Cooperative Multi-agent Systems
We report on an investigation of the learning of coordination in cooperative multi-agent systems. Specifically, we study solutions that are applicable to independent agents i.e. ...
Spiros Kapetanakis, Daniel Kudenko, Malcolm J. A. ...