We present a globally convergent method for regularized risk minimization problems. Our method applies to Support Vector estimation, regression, Gaussian Processes, and any other ...
Reward shaping is a well-known technique applied to help reinforcement-learning agents converge more quickly to nearoptimal behavior. In this paper, we introduce social reward sha...
Monica Babes, Enrique Munoz de Cote, Michael L. Li...
The application of reinforcement learning algorithms to Partially Observable Stochastic Games (POSG) is challenging since each agent does not have access to the whole state inform...
Alessandro Lazaric, Mario Quaresimale, Marcello Re...
We present a reinforcement learning architecture, Dyna-2, that encompasses both samplebased learning and sample-based search, and that generalises across states during both learni...
Hierarchical state decompositions address the curse-ofdimensionality in Q-learning methods for reinforcement learning (RL) but can suffer from suboptimality. In addressing this, w...
Erik G. Schultink, Ruggiero Cavallo, David C. Park...