We propose to improve the locomotive performance of humanoid robots by using approximated biped stepping and walking dynamics with reinforcement learning (RL). Although...
Jun Morimoto, Christopher G. Atkeson, Gen Endo, Go...
In this work, we address the problem of transient and steady-state analysis of a stochastic Petri net which includes non-Markovian distributions with a finite support but ...
We present BL-WoLF, a framework for learnability in repeated zero-sum games where the cost of learning is measured by the losses the learning agent accrues (rather than the number...
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
As postgenomic biology becomes more predictive, the ability to infer rate parameters of genetic and biochemical networks will become increasingly important. In this paper, we expl...