Estimation of distribution algorithms replace the typical crossover and mutation operators by constructing a probabilistic model and generating offspring according to this model....
Widespread adoption of reconfigurable devices requires system level synthesis techniques to take an application written in a high level language and map it to the reconfigurable d...
Abstract— We characterize the stability and achievable performance of networked estimation under correlated packet losses described by the Gilbert-Elliot model. For scalar contin...
— Many decision systems rely on a precisely known Markov Chain model to guarantee optimal performance, and this paper considers the online estimation of unknown, nonstationary Ma...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...