— In this paper we study the problem of dynamic optimization of ping schedule in an active sonar buoy network deployed to provide persistent surveillance of a littoral area throu...
Abstract. Current point-based planning algorithms for solving partially observable Markov decision processes (POMDPs) have demonstrated that a good approximation of the value funct...
A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning....
Carlos Guestrin, Daphne Koller, Chris Gearhart, Ne...
— Target tracking has two variants that are often studied independently with different approaches: target searching requires a robot to find a target initially not visible, and ...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...