Search Sciweavers | Sciweavers

133 search results - page 7 / 27

» Hierarchical Policy Gradient Algorithms


JMLR
2012


Hierarchical Relative Entropy Policy Search

12 years 8 days ago

Download www.ias.informatik.tu-darmstadt.de

Many real-world problems are inherently hierarchically structured. The use of this structure in an agent’s policy may well be the key to improved scalability and higher performa...

Christian Daniel, Gerhard Neumann, Jan Peters


INFOCOM
1995
IEEE


Complexity of Gradient Projection Method for Optimal Routing in Data Networks

14 years 1 months ago

Download www.cs.ou.edu

In this paper, we derive a time-complexity bound for the gradient projection method for optimal routing in data networks. This result shows that the gradient projection algorith...

Wei Kang Tsai, John K. Antonio, Garng M. Huang


ICANNGA
2007
Springer


Reinforcement Learning in Fine Time Discretization

14 years 4 months ago

Download staff.elka.pw.edu.pl

Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet th...

Pawel Wawrzynski


AIPS
2003


Synthesis of Hierarchical Finite-State Controllers for POMDPs

13 years 11 months ago

Download www.aaai.org

We develop a hierarchical approach to planning for partially observable Markov decision processes (POMDPs) in which a policy is represented as a hierarchical finite-state control...

Eric A. Hansen, Rong Zhou


AAAI
2011


Differential Eligibility Vectors for Advantage Updating and Gradient Methods

12 years 9 months ago

Download gaips.inesc-id.pt

In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...

Francisco S. Melo
