Sciweavers

534 search results - page 3 / 107
» Markov Reward Approach to Performability and Reliability Ana...
JMLR
2010
13 years 1 month ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
CORR
2010
Springer
13 years 1 month ago
Online Learning in Opportunistic Spectrum Access: A Restless Bandit Approach
We consider an opportunistic spectrum access (OSA) problem where the time-varying condition of each channel (e.g., as a result of random fading or certain primary users' activ...
Cem Tekin, Mingyan Liu
ECML
2005
Springer
14 years 12 days ago
Active Learning in Partially Observable Markov Decision Processes
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...
Robin Jaulmes, Joelle Pineau, Doina Precup
CDC
2009
IEEE
13 years 11 months ago
Arbitrarily modulated Markov decision processes
We consider decision-making problems in Markov decision processes where both the rewards and the transition probabilities vary in an arbitrary (e.g., nonstationary) fashion. We...
Jia Yuan Yu, Shie Mannor
INFOCOM
2008
IEEE
14 years 1 month ago
QoS Performance Analysis of Cognitive Radio-Based Virtual Wireless Networks
Cognitive radio presents a new approach to wireless spectrum utilization and management. In this work, the potential performance improvement gained by applying cognitive radio t...
Brent Ishibashi, Nizar Bouabdallah, Raouf Boutaba