This work is concerned with the efficient design of a reverse logistics network using an extended version of models currently found in the literature. Those traditional, basic mo...
MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start s...
H. Brendan McMahan, Maxim Likhachev, Geoffrey J. G...
The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experime...
Spoken dialogue management strategy optimization by means of Reinforcement Learning (RL) is now part of the state of the art. Yet, there is still a clear mismatch between the comp...
This paper presents a novel swarm approximate dynamic programming method (swarm-ADP) for parameter optimization of PSO systems, from the perspective of optimal control. Based o...