We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state-action ...
We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
We present a semi-parametric control policy representation and use it to solve a series of nonholonomic control problems with input state spaces of up to 7 dimensions. A neares...
Echo State Networks (ESNs) have been shown to be effective for a number of tasks, including motor control, dynamic time series prediction, and memorizing musical sequences. Howeve...
Matthew H. Tong, Adam D. Bickett, Eric M. Christia...