Sciweavers

115 search results - page 9 / 23
» Recurrent policy gradients
AAAI
2010
13 years 9 months ago
Multi-Agent Learning with Policy Prediction
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...
Chongjie Zhang, Victor R. Lesser
FLAIRS
2004
13 years 9 months ago
Recurrent Neural Networks and Pitch Representations for Music Tasks
We present results from experiments in using several pitch representations for jazz-oriented musical tasks performed by a recurrent neural network. We have run experiments with se...
Judy A. Franklin
NN
1998
Springer
108 views, Neural Networks
13 years 7 months ago
How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies
Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NA...
Tsungnan Lin, Bill G. Horne, C. Lee Giles
ICML
2009
IEEE
14 years 8 months ago
Monte-Carlo simulation balancing
In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...
David Silver, Gerald Tesauro
ORL
2008
68 views
13 years 7 months ago
On polynomial cases of the unichain classification problem for Markov Decision Processes
The unichain classification problem detects whether a finite state and action MDP is unichain under all deterministic policies. This problem is NP-hard [11]. This paper provides p...
Eugene A. Feinberg, Fenghsu Yang