ICML
2005
IEEE

Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees

MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start state to a goal, restricting computation to states relevant to this task can make much larger problems tractable. We introduce a new algorithm, Bounded RTDP, which can produce partial policies with strong performance guarantees while only touching a fraction of the state space, even on problems where other algorithms would have to visit the full state space. To do so, Bounded RTDP maintains both upper and lower bounds on the optimal value function. The performance of Bounded RTDP
Added: 17 Nov 2009
Updated: 17 Nov 2009
Type: Conference
Year: 2005
Where: ICML
Authors: H. Brendan McMahan, Maxim Likhachev, Geoffrey J. Gordon
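
The abstract's central idea, keeping both an upper and a lower bound on the optimal value function while only backing up states reached from the start state, can be illustrated with a small sketch. This is not the authors' implementation: the toy MDP, the function names, the initial bounds, the stopping rule, and the choice to sample successors in proportion to their remaining bound gap are all assumptions made for illustration.

```python
import random

# Hypothetical toy problem (not from the paper): states 0..3, state 3 is the goal.
# TRANSITIONS[state][action] = (cost, [(probability, next_state), ...])
GOAL = 3
TRANSITIONS = {
    0: {"a": (1.0, [(1.0, 1)]), "b": (1.0, [(1.0, 2)])},
    1: {"go": (1.0, [(0.5, GOAL), (0.5, 1)])},
    2: {"go": (5.0, [(1.0, GOAL)])},
}


def q_value(values, state, action):
    """Expected cost of taking `action` in `state`, then following `values`."""
    cost, outcomes = TRANSITIONS[state][action]
    return cost + sum(p * values[nxt] for p, nxt in outcomes)


def backup(values, state):
    """Bellman backup for a cost-minimising MDP."""
    return min(q_value(values, state, a) for a in TRANSITIONS[state])


def run_bounded_trials(start=0, eps=1e-6, max_trials=1000, max_steps=1000, seed=0):
    rng = random.Random(seed)
    # Assumed initial bounds: 0 is a valid lower bound because all costs are
    # positive; 100 is a valid upper bound for this particular toy problem.
    lower = {s: 0.0 for s in range(GOAL + 1)}
    upper = {s: 100.0 for s in range(GOAL + 1)}
    upper[GOAL] = 0.0

    for _ in range(max_trials):
        if upper[start] - lower[start] < eps:
            break  # the start state's value is pinned down to within eps
        state = start
        for _ in range(max_steps):
            if state == GOAL:
                break
            # Keep the bounds monotone: lower never decreases, upper never increases.
            lower[state] = max(lower[state], backup(lower, state))
            upper[state] = min(upper[state], backup(upper, state))
            # Act greedily with respect to the lower bound (an assumption here).
            action = min(TRANSITIONS[state], key=lambda a: q_value(lower, state, a))
            _, outcomes = TRANSITIONS[state][action]
            # Prefer successors whose value is still uncertain.
            weights = [p * (upper[n] - lower[n]) for p, n in outcomes]
            if sum(weights) < eps:
                break  # nothing uncertain left along this action
            state = rng.choices([n for _, n in outcomes], weights=weights)[0]
    return lower, upper
```

Calling `run_bounded_trials()` returns the two bound tables. In this toy problem the optimal expected cost from state 0 is 3, so the returned bounds at state 0 should bracket that value, and the gap between them bounds how far a greedy policy can be from optimal at the start state.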