This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled ??? ?s. We introduce ??? ?, a...
Abstract— Continuous action sets are used in many reinforcement learning (RL) applications in robot control since the control input is continuous. However, discrete action sets a...
Akihiko Yamaguchi, Jun Takamatsu, Tsukasa Ogasawar...
The resource constraint project scheduling problem (RCPSP) is an NP-hard benchmark problem in scheduling which takes into account the limitation of resources’ availabilities in ...
Abstract— This paper presents a learning system that uses Qlearning with a resource allocating network (RAN) for behavior learning in mobile robotics. The RAN is used as a functi...
In this paper, we show how adaptive prototype optimization can be used to improve the performance of function approximation based on Kanerva Coding when solving largescale instanc...