Search Sciweavers | Sciweavers

115 search results - page 12 / 23

» Recurrent policy gradients

IOR
2011

Information Collection on a Graph

15 years 13 days ago

Download www.castlelab.princeton.edu

We derive a knowledge gradient policy for an optimal learning problem on a graph, in which we use sequential measurements to refine Bayesian estimates of individual edge values i...

Ilya O. Ryzhov, Warren B. Powell

KES
2007
Springer

Making Financial Trading by Recurrent Reinforcement Learning

15 years 11 months ago

Download www.sms.dsems.unile.it

In this paper we propose a financial trading system whose strategy is developed by means of an artificial neural network approach based on a recurrent reinforcement learning algo...

Francesco Bertoluzzo, Marco Corazza

PE
2010
Springer

Positive Harris recurrence and diffusion scale analysis of a push pull queueing network

15 years 3 months ago

Download stat.haifa.ac.il

We consider a push pull queueing system with two servers and two types of jobs which are processed by the two servers in opposite order, with stochastic generally distributed proc...

Yoni Nazarathy, Gideon Weiss

NIPS
2007

Incremental Natural Actor-Critic Algorithms

15 years 7 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

INFOCOM
1995
IEEE

Complexity of Gradient Projection Method for Optimal Routing in Data Networks

15 years 9 months ago

Download www.cs.ou.edu

In this paper, we derive a time-complexity bound for the gradient projection method for optimal routing in data networks. This result shows that the gradient projection algorith...

Wei Kang Tsai, John K. Antonio, Garng M. Huang
