Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees

MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start state to a goal, restricting computation to states relevant to this task can make much larger problems tractable. We introduce a new algorithm, Bounded RTDP, which can produce partial policies with strong performance guarantees while only touching a fraction of the state space, even on problems where other algorithms would have to visit the full state space. To do so, Bounded RTDP maintains both upper and lower bounds on the optimal value function. The performance of Bounded RTDP...
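The idea of keeping both an upper and a lower bound on the optimal value function, and stopping once they agree at the start state, can be illustrated with a small sketch. This is not the authors' implementation. The toy chain problem, the function names (`outcomes`, `backup`, `bounded_rtdp`), the parameters `eps` and `tau`, and the rule that samples successors in proportion to the gap between their bounds are all assumptions made for illustration, in a cost-minimization setting.

```python
import random

# Hypothetical toy problem (not from the paper): a chain of states 0..GOAL.
GOAL = 4
ACTIONS = ("safe", "risky")

def outcomes(s, a):
    """Return (cost, [(probability, next_state), ...]) for action a in state s."""
    if a == "safe":
        return 2.0, [(1.0, min(s + 1, GOAL))]
    return 1.0, [(0.5, min(s + 2, GOAL)), (0.5, s)]  # "risky" may stay put

def q_value(v, s, a):
    """One-step lookahead cost of action a in state s under value estimate v."""
    cost, succ = outcomes(s, a)
    return cost + sum(p * v[t] for p, t in succ)

def backup(v_lo, v_hi, s):
    """Bellman backup of both bounds at s; returns the action greedy in v_lo."""
    best = min(ACTIONS, key=lambda a: q_value(v_lo, s, a))
    # max/min keep the lower bound non-decreasing and the upper bound non-increasing.
    v_lo[s] = max(v_lo[s], q_value(v_lo, s, best))
    v_hi[s] = min(v_hi[s], min(q_value(v_hi, s, a) for a in ACTIONS))
    return best

def bounded_rtdp(start=0, eps=1e-3, tau=10.0, max_trials=10_000, seed=0):
    rng = random.Random(seed)
    v_lo = {s: 0.0 for s in range(GOAL + 1)}               # valid: costs are >= 0
    v_hi = {s: 2.0 * (GOAL - s) for s in range(GOAL + 1)}  # cost of always "safe"
    for _ in range(max_trials):
        if v_hi[start] - v_lo[start] <= eps:
            break  # bounds agree at the start state
        s, visited = start, []
        while s != GOAL:
            visited.append(s)
            a = backup(v_lo, v_hi, s)
            _, succ = outcomes(s, a)
            # Assumed sampling rule: favour successors whose bounds are loosest.
            weights = [p * (v_hi[t] - v_lo[t]) for p, t in succ]
            if sum(weights) <= (v_hi[start] - v_lo[start]) / tau:
                break  # successors are already tightly bounded; end the trial
            s = rng.choices([t for _, t in succ], weights=weights)[0]
        for s in reversed(visited):  # propagate tightened bounds back to start
            backup(v_lo, v_hi, s)
    return v_lo, v_hi
```

Calling `bounded_rtdp()` returns the two bound tables. On this toy chain the optimal expected cost from state 0 works out by hand to 4, so both `v_lo[0]` and `v_hi[0]` should close in on that value. The initial upper bound here comes from a known policy (always take `safe`); how to find good initial upper bounds in general is outside the scope of this sketch.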

H. Brendan McMahan, Maxim Likhachev, Geoffrey J. Gordon

ICML 2005 | Large State Spaces | Machine Learning | State Space | Strong Performance Guarantees

» Computing and Using Lower and Upper Bounds for Action Elimination in MDP Planning

» A Q-decomposition and bounded RTDP approach to resource allocation

» An On-Chip Garbage Collection Coprocessor for Embedded Real-Time Systems

» Nonmonotonic Lyapunov functions for stability of discrete time nonlinear and switched syst...

» Adaptive Management of Composite Services under Percentile-Based Service Level Agreements

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2005
Where	ICML
Authors	H. Brendan McMahan, Maxim Likhachev, Geoffrey J. Gordon
