We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
— While the Partially Observable Markov Decision Process (POMDP) provides a formal framework for the problem of robot control under uncertainty, it typically assumes a known and ...
We use concepts from chaos theory in order to model
nonlinear dynamical systems that exhibit deterministic behavior.
Observed time series from such a system can be embedded
into...
This paper explores the issues faced in creating a sys-4 tem that can learn tactical human behavior merely by observing5 a human perform the behavior in a simulation. More specific...
We report results of an interdisciplinary project which aims at endowing a real robot system with the capacity for learning by goaldirected imitation. The control architecture is b...
Wolfram Erlhagen, Albert Mukovskiy, Estela Bicho, ...