Sciweavers

» Mean-Variance Optimization in Markov Decision Processes
ICML
2006
IEEE
14 years 8 months ago
Qualitative reinforcement learning
When the transition probabilities and rewards of a Markov Decision Process are specified exactly, the problem can be solved without any interaction with the environment. When no s...
Arkady Epshteyn, Gerald DeJong
ICML
2006
IEEE
14 years 8 months ago
PAC model-free reinforcement learning
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
ICML
2006
IEEE
14 years 8 months ago
An intrinsic reward mechanism for efficient exploration
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Özgür Simsek, Andrew G. Barto
FLAIRS
2001
13 years 9 months ago
Probabilistic Planning for Behavior-Based Robots
Partially Observable Markov Decision Process models (POMDPs) have been applied to low-level robot control. We show how to use POMDPs differently, namely for sensor planning in the ...
Amin Atrash, Sven Koenig
ICIP
2008
IEEE
14 years 2 months ago
A new theoretic framework for cross-layer optimization
Cross-layer optimization aims at improving the performance of network users operating in a time-varying, error-prone wireless environment. However, current solutions often rely on...
Fangwen Fu, Mihaela van der Schaar