Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
This paper proposes a learning framework for a CPG-based biped locomotion controller using a policy gradient method. Our goal in this study is to develop an efficient learning...
Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, ...
Abstract. Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...