Sciweavers

337 search results - page 59 / 68
» Mean-Variance Optimization in Markov Decision Processes
ICML
1999
IEEE
14 years 8 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
AIPS
2007
13 years 10 months ago
Learning to Plan Using Harmonic Analysis of Diffusion Models
This paper summarizes research on a new emerging framework for learning to plan using the Markov decision process model (MDP). In this paradigm, two approaches to learning to plan...
Sridhar Mahadevan, Sarah Osentoski, Jeffrey Johns,...
AAAI
2010
13 years 9 months ago
PUMA: Planning Under Uncertainty with Macro-Actions
Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan...
Ruijie He, Emma Brunskill, Nicholas Roy
ICRA
2008
IEEE
14 years 2 months ago
Bayesian reinforcement learning in continuous POMDPs with application to robot navigation
We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...
Stéphane Ross, Brahim Chaib-draa, Joelle Pi...
WCNC
2008
IEEE
14 years 2 months ago
A Maximum-Throughput Call Admission Control Policy for CDMA Beamforming Systems
A throughput-maximization call admission control (CAC) policy is proposed for CDMA beamforming systems in which the QoS requirements in both physical and network layers can be ...
Wei Sheng, Steven D. Blostein