Reinforcement learning problems are commonly tackled with temporal difference methods, which use dynamic programming and statistical sampling to estimate the long-term value of ta...
We consider the task of learning to accurately follow a trajectory in a vehicle such as a car or helicopter. A number of dynamic programming algorithms such as Differential Dynami...
J. Zico Kolter, Adam Coates, Andrew Y. Ng, Yi Gu, ...
—Reinforcement learning is a framework in which an agent can learn behavior without knowledge on a task or an environment by exploration and exploitation. Striking a balance betw...
Zhengqiao Ji, Q. M. Jonathan Wu, Maher A. Sid-Ahme...
Improving the sample efficiency of reinforcement learning algorithms to scale up to larger and more realistic domains is a current research challenge in machine learning. Model-ba...
The current evaluation functions for heuristic planning are expensive to compute. In numerous domains these functions give good guidance on the solution, so it worths the computat...