Sciweavers

115 search results - page 11 / 23
» Recurrent policy gradients
NIPS
2008
13 years 9 months ago
The Recurrent Temporal Restricted Boltzmann Machine
The Temporal Restricted Boltzmann Machine (TRBM) is a probabilistic model for sequences that is able to successfully model (i.e., generate nice-looking samples of) several very hi...
Ilya Sutskever, Geoffrey E. Hinton, Graham W. Tayl...
GECCO
2011
Springer
12 years 11 months ago
Evolving complete robots with CPPN-NEAT: the utility of recurrent connections
This paper extends prior work using Compositional Pattern Producing Networks (CPPNs) as a generative encoding for the purpose of simultaneously evolving robot morphology and contr...
Joshua E. Auerbach, Josh C. Bongard
NIPS
2003
13 years 9 months ago
Bounded Finite State Controllers
We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...
Pascal Poupart, Craig Boutilier
ICML
2000
IEEE
14 years 8 months ago
Reinforcement Learning in POMDP's via Direct Gradient Ascent
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...
Jonathan Baxter, Peter L. Bartlett
NN
2002
Springer
13 years 7 months ago
Equivariant nonstationary source separation
Most source separation methods focus on stationary sources, so higher-order statistics are necessary for successful separation unless the sources are temporally correlated. For non...
Seungjin Choi, Andrzej Cichocki, Shun-ichi Amari