We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a rob...
H. Brendan McMahan, Geoffrey J. Gordon, Avrim Blum
We introduce a new EM framework in which it is possible not only to optimize the model parameters but also the number of model components. A key feature of our approach is that we...
The application of Reinforcement Learning (RL) algorithms to learn tasks for robots is often limited by the large dimension of the state space, which may make prohibitive its appli...
Andrea Bonarini, Alessandro Lazaric, Marcello Rest...
It is known that the complexity of the reinforcement learning algorithms, such as Q-learning, may be exponential in the number of environment’s states. It was shown, however, th...
Reinforcement learning is a paradigm under which an agent seeks to improve its policy by making learning updates based on the experiences it gathers through interaction with the en...