Search Sciweavers | Sciweavers

536 search results - page 52 / 108

» Residual Algorithms: Reinforcement Learning with Function Ap...

IJHIS
2006

A new fine-grained evolutionary algorithm based on cellular learning automata

15 years 7 months ago

Download ceit.aut.ac.ir

In this paper, a new evolutionary computing model, called CLA-EC, is proposed. This model is a combination of a model called cellular learning automata (CLA) and the evolutionary ...

Reza Rastegar, Mohammad Reza Meybodi, Arash Hariri

ML
2002
ACM

Technical Update: Least-Squares Temporal Difference Learning

15 years 6 months ago

Download www.research.rutgers.edu

TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...

Justin A. Boyan

AIPS
2007

Discovering Relational Domain Features for Probabilistic Planning

15 years 9 months ago

Download cobweb.ecn.purdue.edu

In sequential decision-making problems formulated as Markov decision processes, state-value function approximation using domain features is a critical technique for scaling up the...

Jia-Hong Wu, Robert Givan

AAAI
2010

Multi-Agent Learning with Policy Prediction

15 years 8 months ago

Download www.cs.umass.edu

Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...

Chongjie Zhang, Victor R. Lesser

NIPS
2008

Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms

15 years 8 months ago

Download groups.csail.mit.edu

Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...

John W. Roberts, Russ Tedrake
