Sciweavers

Search results for "Model Checking Markov Reward Models with Impulse Rewards"
109 results, page 10 of 22
ICRA 2007 · IEEE · Robotics
A formal framework for robot learning and control under model uncertainty
While the Partially Observable Markov Decision Process (POMDP) provides a formal framework for the problem of robot control under uncertainty, it typically assumes a known and ...
Robin Jaulmes, Joelle Pineau, Doina Precup
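
The abstract only names the POMDP framework, so the indented sketch below is not from this paper; it is a generic reminder of what the formal model looks like: a tuple of transition, observation, and reward arrays (all numbers invented here for illustration) together with the standard Bayesian belief update that any POMDP controller relies on.

    # Minimal POMDP sketch (illustrative only): the usual tuple (S, A, O, T, Z, R)
    # plus the standard belief update. All probabilities below are made up.
    import numpy as np

    S, A, O = 2, 2, 2                      # number of states, actions, observations
    T = np.array([[[0.9, 0.1],             # T[a, s, s'] = P(s' | s, a)
                   [0.2, 0.8]],
                  [[0.5, 0.5],
                   [0.5, 0.5]]])
    Z = np.array([[[0.8, 0.2],             # Z[a, s', o] = P(o | s', a)
                   [0.3, 0.7]],
                  [[0.8, 0.2],
                   [0.3, 0.7]]])
    R = np.array([[1.0, 0.0],              # R[s, a] = immediate reward
                  [0.0, 1.0]])

    def belief_update(b, a, o):
        """b'(s') is proportional to Z(o | s', a) * sum_s T(s' | s, a) * b(s)."""
        b_pred = T[a].T @ b                # predicted next-state distribution
        b_new = Z[a][:, o] * b_pred        # weight by observation likelihood
        return b_new / b_new.sum()         # normalise

    b = np.array([0.5, 0.5])               # uniform initial belief
    print(belief_update(b, a=0, o=1))
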
CORR 2010 · Springer · Education
The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...
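
As a concrete but entirely generic illustration of the restless setting described above, the sketch below assumes identical two-state Gilbert-Elliott arms; the parameters, the myopic belief-based policy, and the horizon are arbitrary choices, not the paper's model or its near-logarithmic-regret policy. In the title's non-Bayesian variant the arm parameters are not assumed known, which is what makes low regret nontrivial.

    # Restless bandit sketch (assumed two-state arms): every arm's state keeps
    # evolving, only the played arm is observed, and a myopic policy plays the
    # arm with the highest believed probability of being in the good state.
    import numpy as np

    rng = np.random.default_rng(0)
    N, horizon = 4, 10_000
    p01, p11 = 0.2, 0.8                    # P(bad->good), P(good->good), all arms

    states = rng.integers(0, 2, size=N)    # true (hidden) arm states
    belief = np.full(N, p01 / (p01 + 1 - p11))  # stationary prob. of good state
    total_reward = 0

    for t in range(horizon):
        arm = int(np.argmax(belief))       # myopic choice
        reward = states[arm]               # reward 1 iff the played arm is good
        total_reward += reward
        belief[arm] = float(reward)        # played arm is observed exactly
        belief = belief * p11 + (1 - belief) * p01   # propagate all beliefs one step
        # Restlessness: the true states of *all* arms evolve, played or not.
        good = rng.random(N) < np.where(states == 1, p11, p01)
        states = good.astype(int)

    print("average reward per round:", total_reward / horizon)
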
ICASSP 2008 · IEEE
Link throughput of multi-channel opportunistic access with limited sensing
We aim to characterize the maximum link throughput of a multi-channel opportunistic communication system. The states of these channels evolve as independent and identically dist...
Keqin Liu, Qing Zhao
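
Purely for intuition, and not reproducing the paper's analysis, the following Monte Carlo sketch estimates the throughput of a simple round-robin sensing policy on hypothetical two-state (busy/idle) channels and compares it with a genie that always knows which channels are idle; every parameter is made up.

    # Throughput sketch for limited sensing: one of M channels is sensed per slot.
    import numpy as np

    rng = np.random.default_rng(1)
    M, slots = 3, 100_000
    p_ii, p_bi = 0.7, 0.3                  # P(idle->idle), P(busy->idle)

    ch = rng.integers(0, 2, size=M)        # 1 = idle (usable), 0 = busy
    sensed_ok = 0                          # slots where the sensed channel was idle
    genie_ok = 0                           # slots where some channel was idle

    for t in range(slots):
        sensed = t % M                     # simple round-robin sensing policy
        sensed_ok += ch[sensed]
        genie_ok += int(ch.any())
        # Each channel evolves independently as a two-state Markov chain.
        ch = (rng.random(M) < np.where(ch == 1, p_ii, p_bi)).astype(int)

    print("throughput, round-robin sensing:", sensed_ok / slots)
    print("upper bound with full channel knowledge:", genie_ok / slots)
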
NECO 2007
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
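
To make the BCM rule named in the title concrete, here is a textbook-style sketch, not the paper's model or its link to reinforcement learning: a linear unit whose weights follow dw = eta * x * y * (y - theta), with a sliding threshold theta that tracks the running average of y squared; all constants are illustrative.

    # BCM-style plasticity sketch: potentiation when postsynaptic activity y
    # exceeds the sliding threshold theta, depression otherwise.
    import numpy as np

    rng = np.random.default_rng(2)
    n_in = 10
    w = rng.normal(0.0, 0.1, size=n_in)    # synaptic weights
    theta = 0.0                            # sliding modification threshold
    eta, tau = 1e-3, 0.01                  # learning rate, threshold time scale

    for step in range(5000):
        x = rng.random(n_in)               # presynaptic activity
        y = float(np.dot(w, x))            # postsynaptic activity (linear unit)
        w += eta * x * y * (y - theta)     # BCM update: LTP if y > theta, else LTD
        theta += tau * (y**2 - theta)      # threshold follows the mean of y^2

    print("final weights:", np.round(w, 3))
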
JMLR 2010
Finite-sample Analysis of Bellman Residual Minimization
We consider the Bellman residual minimization approach for solving discounted Markov decision problems, where we assume that a generative model of the dynamics and rewards is avai...
Odalric-Ambrym Maillard, Rémi Munos, Alessa...
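
As a minimal illustration of the objective named in the title, and not the paper's finite-sample analysis, the sketch below performs Bellman residual minimization for policy evaluation on a small synthetic Markov chain with linear features, solving min over theta of ||(Phi - gamma * P * Phi) theta - r||^2 by least squares; the chain, rewards, and features are random placeholders.

    # Bellman residual minimization sketch for a fixed policy with known P and r.
    # V_theta = Phi @ theta; the residual is Phi@theta - (r + gamma * P @ Phi @ theta).
    import numpy as np

    rng = np.random.default_rng(3)
    n_states, n_features, gamma = 6, 3, 0.9

    P = rng.random((n_states, n_states))
    P /= P.sum(axis=1, keepdims=True)      # row-stochastic transition matrix
    r = rng.random(n_states)               # expected one-step rewards
    Phi = rng.normal(size=(n_states, n_features))   # state features

    A = Phi - gamma * P @ Phi              # residual is A @ theta - r
    theta, *_ = np.linalg.lstsq(A, r, rcond=None)

    V_hat = Phi @ theta
    V_true = np.linalg.solve(np.eye(n_states) - gamma * P, r)
    print("max |V_hat - V_true|:", np.max(np.abs(V_hat - V_true)))

When only sampled transitions are available instead of the exact P, an unbiased estimate of the squared residual needs two independent next-state samples per state (the double-sampling issue), which is where the generative model assumed in the abstract becomes convenient.
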