Search Sciweavers | Sciweavers

473 search results - page 73 / 95



CORR
2011
Springer


Close the Gaps: A Learning-while-Doing Algorithm for a Class of Single-Product Revenue Management Problems

14 years 10 months ago

Download www.stanford.edu

In this work, we consider a retailer selling a single product with limited on-hand inventory over a finite selling season. Customer demand arrives according to a Poisson process,...

Zizhuo Wang, Shiming Deng, Yinyu Ye


ICML
2009
IEEE


Regularization and feature selection in least-squares temporal difference learning

16 years 7 months ago

Download ai.stanford.edu

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...

J. Zico Kolter, Andrew Y. Ng


JSAC
2011


An Anti-Jamming Stochastic Game for Cognitive Radio Networks

15 years 1 month ago

Download sig.umd.edu

Various spectrum management schemes have been proposed in recent years to improve the spectrum utilization in cognitive radio networks. However, few of them have considered the ...

Beibei Wang, Yongle Wu, K. J. Ray Liu, T. Charles ...


ICML
2001
IEEE


Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning

16 years 7 months ago

Download www-2.cs.cmu.edu

This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in singl...

Martin Zinkevich, Tucker R. Balch


NIPS
2008


Goal-directed decision making in prefrontal cortex: a computational framework

15 years 7 months ago

Download www.princeton.edu

Research in animal learning and behavioral neuroscience has distinguished between two forms of action control: a habit-based form, which relies on stored action values, and a goal...

Matthew Botvinick, James An
