Sciweavers

81 search results - page 8 / 17
» The Optimal Reward Baseline for Gradient-Based Reinforcement...
ICRA
2009
IEEE
14 years 2 months ago
Least absolute policy iteration for robust value function approximation
Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...
JAIR
2008
13 years 7 months ago
A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Due to the complexity of the problem, the majority of the previo...
Sherief Abdallah, Victor R. Lesser
ACL
2010
13 years 5 months ago
Optimising Information Presentation for Spoken Dialogue Systems
We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attri...
Verena Rieser, Oliver Lemon, Xingkun Liu
ICML
2005
IEEE
14 years 8 months ago
Dynamic preferences in multi-criteria reinforcement learning
The current framework of reinforcement learning is based on maximizing the expected returns based on scalar rewards. But in many real world situations, tradeoffs must be made amon...
Sriraam Natarajan, Prasad Tadepalli
ECAL
2001
Springer
14 years 3 days ago
Evolution of Reinforcement Learning in Uncertain Environments: Emergence of Risk-Aversion and Matching
Reinforcement learning (RL) is a fundamental process by which organisms learn to achieve a goal from interactions with the environment. Using Artificial Life techniques we derive ...
Yael Niv, Daphna Joel, Isaac Meilijson, Eytan Rupp...