In this paper, we consider the problem of planning and learning in the infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policysearc...
The inherent and increasing complexity, heterogeneity and unpredictability of computer networks make the task of managing these systems highly complex. The autonomic computing para...
Romildo Martins da Silva Bezerra, Joberto Sé...
Tetris is a falling block game where the player’s objective is to arrange a sequence of different shaped tetrominoes smoothly in order to survive. In the intelligence games, ag...
Abstract— We propose to improve the locomotive performance of humanoid robots by using approximated biped stepping and walking dynamics with reinforcement learning (RL). Although...
Jun Morimoto, Christopher G. Atkeson, Gen Endo, Go...
The idea of building query-oriented routing indices has changed the way of improving keyword search efficiency from the basis as it can learn the content distribution from the que...