Sciweavers

87 search results - page 9 / 18
» Hybrid Least-Squares Algorithms for Approximate Policy Evalu...
ECML
2005
Springer
14 years 1 month ago
Natural Actor-Critic
This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...
Jan Peters, Sethu Vijayakumar, Stefan Schaal
ISSAC
2007
Springer
14 years 2 months ago
On exact and approximate interpolation of sparse rational functions
The black box algorithm for separating the numerator from the denominator of a multivariate rational function can be combined with sparse multivariate polynomial interpolation alg...
Erich Kaltofen, Zhengfeng Yang
JAIR
2008
13 years 7 months ago
Optimal and Approximate Q-value Functions for Decentralized POMDPs
Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent ...
Frans A. Oliehoek, Matthijs T. J. Spaan, Nikos A. ...
IEEECIT
2010
IEEE
13 years 6 months ago
Predictive and Dynamic Resource Allocation for Enterprise Applications
Dynamic resource allocation has the potential to provide significant increases in total revenue in enterprise systems through the reallocation of available resources as the dem...
M. Al-Ghamdi, Adam P. Chester, Stephen A. Jarvis
ATAL
2007
Springer
14 years 2 months ago
Commitment-driven distributed joint policy search
Decentralized MDPs provide powerful models of interactions in multi-agent environments, but are often very difficult or even computationally infeasible to solve optimally. Here we...
Stefan J. Witwicki, Edmund H. Durfee