value function | Sciweavers

EOR
2006

A parallelizable dynamic fleet management model with random travel times

14 years 6 months ago

In this paper, we present a stochastic model for the dynamic fleet management problem with random travel times. Our approach decomposes the problem into time-staged subproblems by...

Huseyin Topaloglu

AAMAS
2007
Springer

Parallel Reinforcement Learning with Linear Function Approximation

14 years 6 months ago

Download www.aamas-conference.org

In this paper, we investigate the use of parallelization in reinforcement learning (RL), with the goal of learning optimal policies for single-agent RL problems more quickly by us...

Matthew Grounds, Daniel Kudenko

ICML
2010
IEEE

Finite-Sample Analysis of LSTD

14 years 7 months ago

Download hal.inria.fr

In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...

Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...

ICML
2010
IEEE

Bayesian Multi-Task Reinforcement Learning

14 years 7 months ago

Download hal.inria.fr

We consider the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any g...

Alessandro Lazaric, Mohammad Ghavamzadeh

NIPS
1993

Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming

14 years 7 months ago

Download www.cs.cmu.edu

Dynamic programming provides a methodology to develop planners and controllers for nonlinear systems. However, general dynamic programming is computationally intractable. We have ...

Christopher G. Atkeson

NIPS
1998

Risk Sensitive Reinforcement Learning

14 years 7 months ago

Download www.cs.cmu.edu

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...

Ralph Neuneier, Oliver Mihatsch

AAAI
1997

Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes

14 years 7 months ago

Download www.cs.pitt.edu

Partially observable Markov decision processes (POMDPs) allow one to model complex dynamic decision or control problems that include both action outcome uncertainty and imperfect ...

Milos Hauskrecht

AAAI
1998

Applying Online Search Techniques to Continuous-State Reinforcement Learning

14 years 7 months ago

Download www.autonlab.org

In this paper, we describe methods for efficiently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boo...

Scott Davies, Andrew Y. Ng, Andrew W. Moore

NIPS
2003

Applying Metric-Trees to Belief-Point POMDPs

14 years 7 months ago

Download books.nips.cc

Recent developments in grid-based and point-based approximation algorithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of...

Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun

NIPS
2001

Multiagent Planning with Factored MDPs

14 years 7 months ago

Download books.nips.cc

We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication be...

Carlos Guestrin, Daphne Koller, Ronald Parr
