Search Sciweavers | Sciweavers

201 search results - page 33 / 41

» Solving Concurrent Markov Decision Processes

122

ECML
2007
Springer

192 views, Machine Learning

Policy Gradient Critics

15 years 9 months ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

124

JAIR
2006

120 views

FluCaP: A Heuristic Search Planner for First-Order MDPs

15 years 3 months ago

Download www.jair.org

We present a heuristic search algorithm for solving first-order Markov Decision Processes (FOMDPs). Our approach combines first-order state abstraction that avoids evaluating stat...

Steffen Hölldobler, Eldar Karabaev, Olga Skvo...

153

JAIR
2010

115 views

An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs

15 years 1 month ago

Download www.jair.org

Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Parti...

Raghav Aras, Alain Dutech

156

ICML
1999
Morgan Kaufmann

168 views, Machine Learning

Least-Squares Temporal Difference Learning

16 years 3 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

145

AIPS
2007

174 views, Artificial Intelligence

Learning to Plan Using Harmonic Analysis of Diffusion Models

15 years 5 months ago

Download www.cs.umass.edu

This paper summarizes research on a new emerging framework for learning to plan using the Markov decision process model (MDP). In this paradigm, two approaches to learning to plan...

Sridhar Mahadevan, Sarah Osentoski, Jeffrey Johns,...
