A multi-armed bandit episode consists of n trials, each allowing selection of one of K arms, resulting in payoff from a distribution over [0, 1] associated with that arm. We assum...
A new stochastic clustering algorithm is introduced that aims to locate all the local minima of a multidimensional continuous and differentiable function inside a bounded domain. ...
Abstract— We propose to improve the locomotive performance of humanoid robots by using approximated biped stepping and walking dynamics with reinforcement learning (RL). Although...
Jun Morimoto, Christopher G. Atkeson, Gen Endo, Go...
We give the first efficient parallel algorithms for solving the arrangement problem. We give a deterministic algorithm for the CREW PRAM which runs in nearly optimal bounds of O(lo...
A robust dynamic evolutionary algorithm (labeled RODEA), where both the robust calculation and mutation operator are based on an orthogonal design, is proposed in this paper. Prev...
Sanyou Y. Zeng, Rui Wang, Hui Shi, Guang Chen, Hug...