We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...
J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...
The one-step anticipatory algorithm (1s-AA) is an online algorithm making decisions under uncertainty by ignoring future non-anticipativity constraints. It makes near-optimal decis...
This paper presents a new method for studying protein folding kinetics. It uses the recently introduced Stochastic Roadmap Simulation (SRS) method to estimate the transit...
Tsung-Han Chiang, Mehmet Serkan Apaydin, Douglas L...
We present a parameter inference algorithm for autonomous stochastic linear hybrid systems, which computes a maximum-likelihood model, given only a set of continuous output data of...
We address the problem of continuous stochastic optimal control in the presence of hard obstacles. Due to the non-smooth character of the obstacles, the traditional appro...