Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
Temporal difference (TD) learning methods [22] have become popular reinforcement learning techniques in recent years. TD methods have had some experimental successes and have been...
—Reinforcement learning (RL) is a valuable learning method when the systems require a selection of control actions whose consequences emerge over long periods for which input– ...
We consider event dependent routing algorithms for on-line explicit source routing in MPLS networks. The proposed methods are based on load shared sequential routing in which load ...
—Reinforcement learning is the scheme for unsupervised learning in which robots are expected to acquire behavior skills through self-explorations based on reward signals. There a...
Hiroaki Arie, Tetsuya Ogata, Jun Tani, Shigeki Sug...