Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms,includ...
Tommi Jaakkola, Michael I. Jordan, Satinder P. Sin...
When controlling dynamic systems, such as mobile robots in uncertain environments, there is a trade off between risk and reward. For example, a race car can turn a corner faster b...
As shown in [7], optimal control problems with either ODE or PDE dynamics can be solved efficiently using a setting of consistent approximations obtained by numerical discretizati...
Internet search companies sell advertisement slots based on users’ search queries via an auction. Advertisers have to solve a complex optimization problem of how to place bids o...
Abstract-- We consider reinforcement learning, and in particular, the Q-learning algorithm in large state and action spaces. In order to cope with the size of the spaces, a functio...