Sciweavers

32 search results - page 2 / 7
» Learning Policies for Partially Observable Environments: Sca...
UAI
2000
13 years 8 months ago
Learning to Cooperate via Policy Search
Cooperative games are those in which both agents share the same payoff structure. Value-based reinforcement-learning algorithms, such as variants of Q-learning, have been applied t...
Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Les...
ICML
2002
IEEE
14 years 8 months ago
Learning from Scarce Experience
Searching the space of policies directly for the optimal policy has been one popular method for solving partially observable reinforcement learning problems. Typically, with each ...
Leonid Peshkin, Christian R. Shelton
CORR
2011
Springer
12 years 11 months ago
Doubly Robust Policy Evaluation and Learning
We study decision making in environments where the reward is only partially observed, but can be modeled as a function of an action and an observed context. This setting, known as...
Miroslav Dudík, John Langford, Lihong Li
UAI
2008
13 years 9 months ago
Improving Gradient Estimation by Incorporating Sensor Data
An efficient policy search algorithm should estimate the local gradient of the objective function, with respect to the policy parameters, from as few trials as possible. Whereas m...
Gregory Lawrence, Stuart J. Russell
ECML
2005
Springer
14 years 1 month ago
Model-Based Online Learning of POMDPs
Learning to act in an unknown partially observable domain is a difficult variant of the reinforcement learning paradigm. Research in the area has focused on model-free m...
Guy Shani, Ronen I. Brafman, Solomon Eyal Shimony