Search Sciweavers | Sciweavers

128 search results - page 17 / 26

» Hierarchically Optimal Average Reward Reinforcement Learning

ATAL
2009
Springer

137 views | Intelligent Agents

Generalized model learning for reinforcement learning in factored domains

16 years 2 months ago

Download userweb.cs.utexas.edu

Improving the sample efficiency of reinforcement learning algorithms to scale up to larger and more realistic domains is a current research challenge in machine learning. Model-ba...

Todd Hester, Peter Stone

AAAI
2000

139 views | Intelligent Agents

Localizing Search in Reinforcement Learning

15 years 8 months ago

Download www.cs.colorado.edu

Reinforcement learning (RL) can be impractical for many high dimensional problems because of the computational cost of doing stochastic search in large state spaces. We propose a ...

Gregory Z. Grudic, Lyle H. Ungar

NIPS
2007

207 views | Information Technology

Bayes-Adaptive POMDPs

15 years 8 months ago

Download books.nips.cc

Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning...

Stéphane Ross, Brahim Chaib-draa, Joelle Pi...

INFOCOM
2012
IEEE

189 views | Communications

Approximately optimal adaptive learning in opportunistic spectrum access

13 years 9 months ago

Download web.eecs.umich.edu

In this paper we develop an adaptive learning algorithm which is approximately optimal for an opportunistic spectrum access (OSA) problem with polynomial complexity. In this OSA...

Cem Tekin, Mingyan Liu

ICML
2009
IEEE

104 views | Machine Learning

Learning when to stop thinking and do something!

16 years 8 months ago

Download www.cs.ualberta.ca

An anytime algorithm is capable of returning a response to the given task at essentially any time; typically the quality of the response improves as the time increases. Here, we c...

Barnabás Póczos, Csaba Szepesvá...
