Search Sciweavers | Sciweavers

1233 search results - page 130 / 247

» Feudal Reinforcement Learning

174

click to vote

NIPS
1996

192views Information Technology» more NIPS 1996»

Multidimensional Triangulation and Interpolation for Reinforcement Learning

15 years 7 months ago

Download www.cs.cmu.edu

Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...

Scott Davies

claim paper

Read More »

161

click to vote

AAAI
2000

139views Intelligent Agents» more AAAI 2000»

Localizing Search in Reinforcement Learning

15 years 7 months ago

Download www.cs.colorado.edu

Reinforcement learning (RL) can be impractical for many high dimensional problems because of the computational cost of doing stochastic search in large state spaces. We propose a ...

Gregory Z. Grudic, Lyle H. Ungar

claim paper

Read More »

154

click to vote

EWCBR
2006
Springer

115views Automated Reasoning» more EWCBR 2006»

Multi-agent Case-Based Reasoning for Cooperative Reinforcement Learners

15 years 9 months ago

Download ml.informatik.uni-freiburg.de

Abstract. In both research fields, Case-Based Reasoning and Reinforcement Learning, the system under consideration gains its expertise from experience. Utilizing this fundamental c...

Thomas Gabel, Martin Riedmiller

claim paper

Read More »

180

click to vote

IJCAI
2001

151views Artificial Intelligence» more IJCAI 2001»

R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning

15 years 7 months ago

Download jmlr.csail.mit.edu

R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...

Ronen I. Brafman, Moshe Tennenholtz

claim paper

Read More »

133

Voted

AI
2006
Springer

103views Artificial Intelligence» more AI 2006»

Trace Equivalence Characterization Through Reinforcement Learning

15 years 9 months ago

Download www2.ift.ulaval.ca

In the context of probabilistic verification, we provide a new notion of trace-equivalence divergence between pairs of Labelled Markov processes. This divergence corresponds to the...

Josee Desharnais, François Laviolette, Kris...

claim paper

Read More »

« Prev « First page 130 / 247 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers