Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...
Many real-world problems are inherently hierarchically structured. The use of this structure in an agent’s policy may well be the key to improved scalability and higher performa...
We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization al...
—This paper introduces an algorithm for direct search of control policies in continuous-state discrete-action Markov decision processes. The algorithm looks for the best closed-l...
Lucian Busoniu, Damien Ernst, Bart De Schutter, Ro...