Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Recent psychological and neurological evidence suggests that biological object recognition is a process of matching sensed images to stored iconic memories. This paper presents a p...
Using adaptive, personalized courses is rewarding, as it can create a better learning experience, tailored for a specific learner’s needs. The process of creating these courses,...
In reinforcement learning, an agent tries to learn a policy, i.e., how to select an action in a given state of the environment, so that it maximizes the total amount of reward it ...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...