Sciweavers

67 search results - page 7 / 14
» Limits of Multi-Discounted Markov Decision Processes
JMLR
2010
13 years 2 months ago
Variational methods for Reinforcement Learning
We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
Thomas Furmston, David Barber
CALCO
2007
Springer
14 years 1 months ago
Applications of Metric Coinduction
Metric coinduction is a form of coinduction that can be used to establish properties of objects constructed as a limit of finite approximations. One can prove a coinduction step s...
Dexter Kozen, Nicholas Ruozzi
ICRA
2010
IEEE
13 years 6 months ago
Variable resolution decomposition for robotic navigation under a POMDP framework
Partially Observable Markov Decision Processes (POMDPs) offer a powerful mathematical framework for making optimal action choices in noisy and/or uncertain environments, in par...
Robert Kaplow, Amin Atrash, Joelle Pineau
ICASSP
2009
IEEE
13 years 11 months ago
Evolution of social P2P networks based on the dynamics of heterogeneous multimedia peers
In this paper, we consider social peer-to-peer (P2P) networks, where peers are sharing their resources (i.e., multimedia content and upload bandwidth). In the considered P2P netwo...
Hyunggon Park, Mihaela van der Schaar
ICC
2007
IEEE
14 years 1 months ago
Structure and Optimality of Myopic Sensing for Opportunistic Spectrum Access
We consider opportunistic spectrum access for secondary users over multiple channels whose occupancy by primary users is modeled as discrete-time Markov processes. Due to hardware...
Qing Zhao, Bhaskar Krishnamachari