Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
— In this work we present a robust detection method in outdoor scenes under cast shadows using color based invariant gradients in combination with HoG local features. The method ...
Michael Villamizar, Jorge Scandaliaris, Alberto Sa...
— Many everyday human skills can be framed in terms of performing some task subject to constraints imposed by the environment. Constraints are usually unobservable and frequently...
Matthew Howard, Stefan Klanke, Michael Gienger, Ch...
This paper presents a new method for predicting the values of policies for cloned multiple teleo-reactive robots operating in the context of exogenous events. A teleo-reactive robo...
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...