Search Sciweavers | Sciweavers

» Markov Decision Processes with Arbitrary Reward Processes

JMLR
2008

Finite-Time Bounds for Fitted Value Iteration

15 years 4 months ago

Download www.sztaki.hu

In this paper we develop a theoretical analysis of the performance of sampling-based fitted value iteration (FVI) to solve infinite state-space, discounted-reward Markovian decisi...

Rémi Munos, Csaba Szepesvári

ICASSP
2010
IEEE

Signal Processing

Distributed learning in cognitive radio networks: Multi-armed bandit with distributed multiple players

15 years 4 months ago

Download www.ece.ucdavis.edu

We consider a cognitive radio network with distributed multiple secondary users, where each user independently searches for spectrum opportunities in multiple channels without e...

Keqin Liu, Qing Zhao

ICC
2007
IEEE

Communications

Structure and Optimality of Myopic Sensing for Opportunistic Spectrum Access

15 years 10 months ago

Download www.ece.ucdavis.edu

We consider opportunistic spectrum access for secondary users over multiple channels whose occupancy by primary users is modeled as discrete-time Markov processes. Due to hardware...

Qing Zhao, Bhaskar Krishnamachari

ICML
2006
IEEE

Machine Learning

PAC model-free reinforcement learning

16 years 5 months ago

Download cseweb.ucsd.edu

For a Markov Decision Process with finite state space (size S) and finite action space (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...

Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...

NIPS
2004

Information Technology

A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees

15 years 5 months ago

Download books.nips.cc

We introduce a new algorithm based on linear programming that approximates the differential value function of an average-cost Markov decision process via a linear combination of p...

Daniela Pucci de Farias, Benjamin Van Roy
