Search Sciweavers | Sciweavers

95 search results - page 10 / 19

» Policy Gradients for Cryptanalysis

NIPS
2007

Incremental Natural Actor-Critic Algorithms

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

INFOCOM
1995
IEEE

Complexity of Gradient Projection Method for Optimal Routing in Data Networks

In this paper, we derive a time-complexity bound for the gradient projection method for optimal routing in data networks. This result shows that the gradient projection algorith...

Wei Kang Tsai, John K. Antonio, Garng M. Huang

DATE
2008
IEEE

Thermal Balancing Policy for Streaming Computing on Multiprocessor Architectures

As feature sizes decrease, power dissipation and heat generation density exponentially increase. Thus, temperature gradients in Multiprocessor Systems on Chip (MPSoCs) can serious...

Fabrizio Mulas, Michele Pittau, Marco Buttu, Salva...

ICANNGA
2007
Springer

Reinforcement Learning in Fine Time Discretization

Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet th...

Pawel Wawrzynski

NIPS
1998

Gradient Descent for General Reinforcement Learning

A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms. These algorithms solve a number ...

Leemon C. Baird III, Andrew W. Moore
