We consider approximate policy evaluation for finite state and action Markov decision processes (MDPs) in the off-policy learning context and with the simulation-based least square...
The computation of a rigid-body transformation that optimally aligns a set of measurement points with a surface and related registration problems are studied from the viewpoint o...
Helmut Pottmann, Qi-Xing Huang, Yong-Liang Yang, S...
Stochastic games are a generalization of MDPs to multiple agents, and can be used as a framework for investigating multiagent learning. Hu and Wellman (1998) recently proposed a m...
Although tabular reinforcement learning (RL) methods have been proved to converge to an optimal policy, the combination of particular conventional reinforcement learning techniques...
In this paper, an edge-preserving nonlinear iterative regularization-based image resampling method for a single noise-free image is proposed. Several aspects of the resampling alg...